RodgerWW

Grab data from Website [Secure login required]

OK, I would like to build a workflow that grabs my current data usage from my cable provider's account page.

The initial login page is here: https://myaccount.cogeco.ca/acpub/login/

They are using some sort of authentication to protect users, and my knowledge of curl in Bash is very limited.

As far as I can tell, I need to use curl twice: first to authenticate against one URL, then to log in with another URL, perhaps using cookies?

Can someone help me out here?

You'd have a pretty hard time doing that with bash and curl.

 

It'd be much easier using Python or Ruby and the mechanize library. That can fill out and submit forms and store cookies for you. If the site relies on JavaScript, however, you'll need something much more heavyweight, like Selenium.
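
As a rough sketch of what the mechanize route might look like (untested against the real site): only the login URL comes from this thread. The field names `username` and `password`, the guess that the login form is the first form on the page, the "123.4 GB" pattern, the helper names and the `COGECO_USER`/`COGECO_PASS` environment variables are all assumptions made up for illustration.

```ruby
# Sketch only: the form field names, the "first form on the page" guess and
# the usage pattern below are assumptions, not taken from the real site.

LOGIN_URL = 'https://myaccount.cogeco.ca/acpub/login/'

# Pull the first figure that looks like "123.4 GB" out of a page's text.
# Returns nil when nothing matches. (Hypothetical pattern.)
def extract_usage(text)
  match = text.match(/(\d+(?:\.\d+)?)\s*GB/i)
  match && match[1].to_f
end

# Log in with mechanize and return the usage figure from the page that
# comes back. mechanize keeps the session cookies between requests.
def fetch_usage(username, password)
  require 'mechanize' # loaded lazily so extract_usage works without the gem

  agent = Mechanize.new
  login_page = agent.get(LOGIN_URL)

  form = login_page.forms.first # assumption: the login form is the first form
  form['username'] = username   # assumption: hypothetical field name
  form['password'] = password   # assumption: hypothetical field name

  account_page = agent.submit(form)
  extract_usage(account_page.body)
end

# Only runs when both (hypothetical) environment variables are set.
if ENV['COGECO_USER'] && ENV['COGECO_PASS']
  usage = fetch_usage(ENV['COGECO_USER'], ENV['COGECO_PASS'])
  puts usage ? "#{usage} GB used" : 'No usage figure found'
end
```

You would need to inspect the real login page to find the actual form and field names, and the usage figure may well live on a different page than the one returned after login.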

Expanding on deanishe’s answer, for Ruby I’d also suggest watir-webdriver: it’s easy to pick up even without any working Ruby knowledge, in great part thanks to its simplicity and the good, short examples on the website.

Pardon my ignorance, but when I try to install mechanize with:

gem install mechanize

all I get is:

Fetching: unf_ext-0.0.6.gem (100%)
ERROR:  While executing gem ... (Gem::FilePermissionError)
    You don't have write permissions for the /Library/Ruby/Gems/2.0.0 directory.

How do I correct this?

If you don’t use something to manage Ruby versions, which seems to be the case (and which you should, depending on your needs and experience), you need to prepend your command with sudo, as in sudo gem install mechanize. That will install it system-wide, which, while not ideal, might be easier for you to manage.

I had read somewhere that using sudo was not a good idea, but I did it anyway, and now when I try to run a Ruby script I get this:

/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- mechanize (LoadError)
from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /Users/Rodger/Desktop/new.rb:1:in `<main>'

BTW: the script is:

require 'mechanize'
mechanize = Mechanize.new
page = mechanize.get('http://stackoverflow.com/')
puts page.title

People warn against sudo for a few reasons. Basically, you're altering your system as a superuser, which means you can break just about anything. Yes, sudo will let you break anything on your system, possibly irreparably, because it overrides the protections you normally have.

For the gems, it's up to you. It probably won't break anything, but the idea is that your system has a set of libraries (basically, gems) installed, and you're adding these system-wide. If you're doing this on your own computer, then, in practice, it's probably fine (but you do need to remember that you installed it, and that other people haven't, if you decide to distribute your workflow).

But the script above works just fine for me after I installed the `mechanize` gem. Right now, it looks like you haven't installed it: the line require 'mechanize' looks for a gem named mechanize in the load paths, but your system isn't finding it.
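
One way to check, as a minimal sketch: run something like the following with the same ruby that runs your script. It prints which Ruby is in use, where RubyGems is looking, and whether it can see mechanize. The helper name `gem_installed?` is made up for illustration; the rest are standard RubyGems and RbConfig calls.

```ruby
# Report which Ruby is running and whether it can see a given gem.
require 'rbconfig'

# True when RubyGems can find at least one installed version of the gem.
# (Hypothetical helper name.)
def gem_installed?(name)
  !Gem::Specification.find_all_by_name(name).empty?
end

puts "Ruby binary:  #{RbConfig.ruby}"
puts "Ruby version: #{RUBY_VERSION}"
puts 'Gem search paths:'
Gem.path.each { |dir| puts "  #{dir}" }
puts "mechanize visible: #{gem_installed?('mechanize')}"
```

If the last line prints false, the gem is not in any of the listed search paths for that Ruby, which would explain the LoadError.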

If your workflow uses any non-standard libraries, it's usually a better idea to include them with the workflow instead of installing them system-wide. This is the common practice with programs (not libraries) written in Ruby, Python etc. I believe Bundler is the Ruby weapon of choice.

There are several advantages to including dependencies with software instead of installing system-wide:

  • Version conflicts. A Python environment holds only one version of a library, and although RubyGems can keep several versions of a gem side by side, a program only gets the one it asked for if it pins it. With a system-wide install, you're screwed if two programs require two incompatible versions of the same library.
  • It makes distributing workflows (and other software) much easier. It just works out of the box.
  • A system upgrade/reinstall won't break everything. If your libraries are installed in the system, it's pot luck whether they'll still be there after an upgrade.
  • So much easier for users to install (if you distribute your software), and others to hack on (if it's open source). You really don't want to be responsible for updating a library and breaking a user's favourite program. Especially if the user is technically inept.
  • Much easier when you release a new version. It still works out of the box. No need for users to break out Terminal and update a bunch of libraries when they install a new version.

If your workflow requires so many libraries that it's too big to distribute easily (>10MB), you could include a script that will install them (locally in the workflow) when the workflow is first run.
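
To illustrate the bundling idea, here is a minimal sketch of a workflow script picking up gems shipped inside its own folder. The `vendor/gems/<gem>/lib` layout and the helper name `add_vendored_gems` are assumptions for illustration, not an Alfred or Bundler convention, and gems with compiled extensions would likely need more than a load-path entry.

```ruby
# Add every bundled gem's lib directory to Ruby's load path.
# Assumes a hypothetical layout: <workflow>/vendor/gems/<gem>/lib
def add_vendored_gems(workflow_dir)
  pattern = File.join(workflow_dir, 'vendor', 'gems', '*', 'lib')
  lib_dirs = Dir.glob(pattern).sort
  lib_dirs.each do |dir|
    # Put bundled copies ahead of anything installed system-wide.
    $LOAD_PATH.unshift(dir) unless $LOAD_PATH.include?(dir)
  end
  lib_dirs
end

# Treat the folder containing this script as the workflow folder.
workflow_dir = File.dirname(File.expand_path(__FILE__))
added = add_vendored_gems(workflow_dir)
puts "Bundled lib directories added: #{added.length}"
```

After this runs, a plain require for a bundled library resolves to the copy inside the workflow first, so nothing has to be installed system-wide.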

Edited by deanishe

Dean is right about everything he says above (Bundler, etc.).

The reason it's hard to distribute workflows >10MB is that GitHub doesn't accept unrecognized filetypes >10MB (Packal uses GitHub as a backend, so it rejects everything over that). While a .alfredworkflow is actually a zip file, it's technically "unrecognized."
