Grab data from Website [Secure login required]


OK, I would LIKE to be able to make a workflow to grab my current data usage from my cable provider's account page.

 

The initial LOGIN page is here: https://myaccount.cogeco.ca/acpub/login/

 

Now, they are using some sort of authentication to protect users, and I am VERY limited in my knowledge of curl in BASH.

 

As far as I can tell, I need to use curl twice: first to authenticate against one URL, then to log in with another URL ... perhaps using cookies?

 

Can someone help me out here perhaps?

 

Link to comment

You'd have a pretty hard time doing that with bash and curl.

 

It'd be much easier using Python or Ruby and the mechanize library. That can fill out and submit forms and store cookies for you. If the site relies on JavaScript, however, you'll need something much more heavyweight, like Selenium.
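
For reference, here is a rough sketch of the two-request flow the question describes (log in, then reuse the session cookie), using only Ruby's standard library. It is not a working Cogeco login: the form field names and the usage-page URL in the usage comment are hypothetical, and a real site may also need hidden form tokens or redirect handling, which is exactly the bookkeeping mechanize does for you.

```ruby
require 'net/http'
require 'uri'

# Reduce the Set-Cookie headers from a response to a single Cookie header
# value. Only each cookie's name=value pair is kept; attributes such as Path
# or HttpOnly are dropped.
def cookie_header(set_cookie_values)
  set_cookie_values.map { |value| value.split(';').first.strip }.join('; ')
end

# Step 1: POST the login form. Step 2: GET the protected page, sending back
# whatever cookies the login response set.
def fetch_after_login(login_url, data_url, form_fields)
  login_response = Net::HTTP.post_form(URI(login_url), form_fields)
  cookies = cookie_header(login_response.get_fields('set-cookie') || [])

  data_uri = URI(data_url)
  request = Net::HTTP::Get.new(data_uri)
  request['Cookie'] = cookies unless cookies.empty?

  Net::HTTP.start(data_uri.host, data_uri.port,
                  use_ssl: data_uri.scheme == 'https') do |http|
    http.request(request).body
  end
end

# Hypothetical usage (the field names and the second URL are guesses):
# html = fetch_after_login(
#   'https://myaccount.cogeco.ca/acpub/login/',
#   'https://example.com/usage',
#   'username' => 'me', 'password' => 'secret'
# )
```

If the login form is built or submitted by JavaScript, a plain HTTP client like this will not be enough.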

Link to comment

Pardon my ignorance, but when I try to install mechanize with:

gem install mechanize

all I get is:

Fetching: unf_ext-0.0.6.gem (100%)
ERROR:  While executing gem ... (Gem::FilePermissionError)
    You don't have write permissions for the /Library/Ruby/Gems/2.0.0 directory.

How do I correct this?

Link to comment

It looks like you aren't using anything to manage Ruby versions (which you probably should, depending on your needs and experience). In that case, you need to prepend sudo to your command, as in sudo gem install mechanize. That will install the gem system-wide, which, while not ideal, might be easier for you to manage.

Link to comment

I had read somewhere that using sudo was not a good idea ... but, I did it anyway, and now when I try to run a ruby script I get this:

/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- mechanize (LoadError)
from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /Users/Rodger/Desktop/new.rb:1:in `<main>'

BTW: the script is:

require 'mechanize'
mechanize = Mechanize.new
page = mechanize.get('http://stackoverflow.com/')
puts page.title
Link to comment
  • 3 weeks later...

People warn about sudo for a few reasons. Basically, you're altering your system as a superuser, so you can break just about anything. Yes, sudo will let you break (irreparably) anything on your system, because it overrides any sort of protection that you have.

 

For the gems, it's up to you. It probably won't break anything, but the idea is that your system has a set of libraries (basically, gems) installed, and you're adding to these system-wide. If you're doing this on your own computer, then, in practice, it's probably fine (but you do need to remember that you installed the gem, and that other people won't have it if you decide to distribute your workflow).

 

But the script above works just fine for me after I installed the `mechanize` gem. Right now, it looks like it isn't installed for the Ruby that is running your script: the line require 'mechanize' looks for a gem named mechanize in the load paths, and your system isn't finding it.
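
One way to check is to ask the Ruby that runs your script what it can see. The snippet below is just a diagnostic sketch; the helper names `gem_visible?` and `gem_report` are made up for this example. It prints which interpreter is running, which directories it searches for gems, and whether mechanize shows up in any of them.

```ruby
require 'rubygems'
require 'rbconfig'

# True if the Ruby running this script can see at least one installed
# version of the named gem.
def gem_visible?(name)
  !Gem::Specification.find_all_by_name(name).empty?
end

# Collect the facts that usually explain a LoadError after "gem install".
def gem_report(name)
  {
    ruby: RbConfig.ruby,          # path of the interpreter actually running
    gem_paths: Gem.path,          # directories this Ruby searches for gems
    visible: gem_visible?(name)   # whether the gem is in any of them
  }
end

gem_report('mechanize').each { |key, value| puts "#{key}: #{value.inspect}" }
```

If `visible` comes back false, compare the listed gem paths with the directory that gem install reported writing to.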

Link to comment

If your workflow uses any non-standard libraries, it's usually a better idea to include them with the workflow instead of installing them system-wide. This is the common practice with programs (not libraries) written in Ruby, Python etc. I believe bundler is the Ruby weapon of choice.

 

There are several advantages to including dependencies with software instead of installing system-wide:

 

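  • Version conflicts. Python can't have two versions of a library installed side by side in one environment. RubyGems can keep several versions installed, but with a system-wide install you're still screwed if two programs require incompatible versions of the same library and nothing pins which version each one gets.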
  • It makes distributing workflows (and other software) much easier. It just works out of the box.
  • A system upgrade/reinstall won't break everything. If your libraries are installed in the system, it's pot luck whether they'll still be there after an upgrade.
  • So much easier for users to install (if you distribute your software), and others to hack on (if it's open source). You really don't want to be responsible for updating a library and breaking a user's favourite program. Especially if the user is technically inept.
  • Much easier when you release a new version. It still works out of the box. No need for users to break out Terminal and update a bunch of libraries when they install a new version.

 

If your workflow requires so many libraries it's too big to distribute easily (>10MB), you could include a script that will install them (locally in the workflow) when the workflow is first run.
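
A minimal sketch of such a first-run installer in Ruby might look like the following. The helper names `vendored?` and `ensure_vendored` and the vendor directory layout are assumptions for illustration, not an established Alfred convention. It relies on gem install's `--install-dir` option and on `Gem.use_paths` to point RubyGems at the workflow's own directory.

```ruby
require 'fileutils'
require 'rubygems'

# Rough check: is some version of gem_name already unpacked under
# vendor_dir? (gem install --install-dir puts gems in <dir>/gems/<name>-<version>.)
def vendored?(vendor_dir, gem_name)
  !Dir.glob(File.join(vendor_dir, 'gems', "#{gem_name}-*")).empty?
end

# Install gem_name into vendor_dir on first run, then make RubyGems look
# there, so a later require finds the workflow's own copy.
def ensure_vendored(vendor_dir, gem_name)
  unless vendored?(vendor_dir, gem_name)
    FileUtils.mkdir_p(vendor_dir)
    ok = system('gem', 'install', '--install-dir', vendor_dir,
                '--no-document', gem_name)
    raise "could not install #{gem_name}" unless ok
  end
  Gem.use_paths(vendor_dir)
end

# Hypothetical usage at the top of the workflow's script:
# ensure_vendored(File.join(__dir__, 'vendor'), 'mechanize')
# require 'mechanize'
```

Bear in mind that the first run needs a network connection, and gems with native extensions (mechanize depends on nokogiri, for example) may need a compiler on the user's machine.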

Edited by deanishe
Link to comment

Dean is right about everything he says above (i.e. bundler, etc...).

 

The reason it's hard to distribute workflows >10MB is that GitHub doesn't accept unrecognized filetypes >10MB (Packal uses GitHub as a backend, so it rejects everything over that). While a .alfredworkflow is actually a zipfile, it's technically "unrecognized."

Link to comment
