A curl-ier Curb – better cookie support in Curb

18 Jun 2009

As of Curb 0.3.7, Curb comes with slightly better cookie support that makes it more “curl-like”.

It probably sounds like shameless self-promotion to blog about these changes, since I contributed them. Well, yeah, that’s true, but I’m also writing this because I doubt the changes would otherwise become known: Curb doesn’t publish updates or a changelog.

Curb is my current number 1 HTTP client for Ruby so the more love it gets, the better.

Passing cookies as a string in Curb requests

Curl veterans will probably know how to do this with the curl binary:

curl -b "auth=abcdef; ASP.NET_SessionId=lotsatext;" example.com

This sends a request to example.com with 2 cookies named “auth” and “ASP.NET_SessionId” (I hate those big-ass ASP.NET cookies btw). There wasn’t a way to set this in Curb, so I looked up the libcurl C API docs and replicated the same option in Curb (commit on Github). An example:

curl = Curl::Easy.new('http://example.com/')
curl.cookies = 'auth=abcdef; ASP.NET_SessionId=big-wall-of-text;'
curl.perform

Of course, the cookies will more often be retrieved/constructed rather than a literal like in the example above. In my case, I was proxying cookies while trying to wrap an API around a site that doesn’t have one.
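When the cookies come from somewhere else (a session store, a previous response, another client), they usually arrive as name/value pairs rather than a ready-made string. Here is a minimal sketch of building that string in plain Ruby. The helper name `cookie_string` is made up for this example and is not part of Curb, and it assumes the values need no escaping.

```ruby
# Hypothetical helper (not part of Curb): turns a Hash of cookie names and
# values into the "name=value; name=value;" form used with curl -b above.
# Assumes names and values contain no ";" and need no escaping.
def cookie_string(cookies)
  cookies.map { |name, value| "#{name}=#{value};" }.join(' ')
end

pairs = { 'auth' => 'abcdef', 'ASP.NET_SessionId' => 'lotsatext' }
cookie_header = cookie_string(pairs)
puts cookie_header
```

The resulting string is what you would then assign to curl.cookies before calling curl.perform.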

Passing cookies as a file via the “cookiefile” option

The second change is the new cookiefile option. This replicates curl commands like these:

curl -b cookies-to-send.txt www.example.com

with something like this in Ruby:

curl = Curl::Easy.new('http://example.com/')
curl.cookiefile = '/path/to/cookies-to-send.txt'
curl.perform

The cookies file looks like this (you can get a sample with the --cookie-jar option to curl, e.g. curl --cookie-jar cookies.txt www.wego.com):

www.wego.com	FALSE	/	FALSE	0	lang	
www.wego.com	FALSE	/	FALSE	0	user_country_code	SG
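Each line in that file is the Netscape cookie file format: seven tab-separated fields (domain, include-subdomains flag, path, secure flag, expiry as a Unix timestamp, name, value). If you would rather generate such a file than capture one, a rough sketch could look like this. The helper `cookie_file_line` and its keyword arguments are invented for the example and are not a Curb API.

```ruby
# Hypothetical helper (not part of Curb): formats one cookie as a line in the
# Netscape cookie file format, i.e. seven tab-separated fields.
def cookie_file_line(domain:, name:, value:, path: '/', secure: false,
                     include_subdomains: false, expires: 0)
  [
    domain,
    include_subdomains ? 'TRUE' : 'FALSE',
    path,
    secure ? 'TRUE' : 'FALSE',
    expires.to_i, # Unix timestamp; 0 means a session cookie
    name,
    value
  ].join("\t")
end

line = cookie_file_line(domain: 'www.wego.com', name: 'user_country_code', value: 'SG')
puts line
```

Writing one such line per cookie to a file gives you something you can point curl.cookiefile at.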

Check out the commit on Github if you’re interested.

Why would I use these?

Now you might be wondering how these 2 changes are useful – well, they are totally irrelevant to you if you’re not expecting any cookie support in Curb. However, if you’re accessing or scraping a site that uses cookie-based authentication, these changes allow you to keep your Curb client authenticated across sessions, even when doing HTTP POSTs (Curb doesn’t send cookies properly in POST requests even if curl.enable_cookies is set).
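To make the "stay authenticated" part concrete: the server hands out its cookies in Set-Cookie response headers, and those need to be turned back into a string you can send on later requests. Below is a deliberately naive sketch of that step in plain Ruby, with no network access. The helper `cookies_from_headers` and the sample header lines are made up for illustration.

```ruby
# Hypothetical helper (not part of Curb): extracts "name=value" from raw
# response header lines and joins them into a cookie string.
# Naive on purpose: attributes such as Path, Expires and Domain are ignored.
def cookies_from_headers(header_lines)
  pairs = header_lines.map do |line|
    match = line.match(/\ASet-Cookie:\s*([^=;\s]+)=([^;\r\n]*)/i)
    "#{match[1]}=#{match[2]};" if match
  end
  pairs.compact.join(' ')
end

# Sample header lines, invented for this example.
headers = [
  "HTTP/1.1 200 OK\r\n",
  "Set-Cookie: auth=abcdef; path=/; HttpOnly\r\n",
  "Set-Cookie: lang=en\r\n",
  "Content-Type: text/html\r\n"
]
puts cookies_from_headers(headers)
```

In a real client the header lines would be collected while the login request runs, and the resulting string assigned to curl.cookies on the following requests.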

I’ve found these changes to Curb particularly useful since I vastly prefer Curb to most HTTP clients for its speed and lightweight implementation, and heavier scraper-type HTTP clients like Mechanize are a last resort.

6 Responses to A curl-ier Curb – better cookie support in Curb

Waseem

August 17th, 2009 at 10pm

That is an awesome change to Curb. Could you please give an example of how to use this feature? How do I authenticate against a cookie-based authentication system?

Chu Yeow

August 18th, 2009 at 9am

@Waseem here’s an example (I actually use something like this in an app):

curl = Curl::Easy.new('http://example.com/login')

# Extract cookies in response.
cookies = []
curl.on_header { |header|

  # Parse cookies from the headers (yes, this is a naive implementation but it's fast).
  cookies << "#{$1}=#{$2}" if header =~ /^Set-Cookie: ([^=]+)=([^;]+;)/

  header.length
}

# POST to login.
curl.http_post(
  Curl::PostField.content('username', 'foo'),
  Curl::PostField.content('password', 'bar')
)

# Reset the on_header handler.
curl.on_header

# Now you can use the auth cookies in future requests.
curl = Curl::Easy.new('http://example.com/private/page')
curl.cookies = cookies.join(' ')
curl.perform

Thufir

February 1st, 2010 at 9am

how do you create a cookie as:

You then have to add yourself a cookie (well, it look like Google doesn’t add it itself) with the following properties :
name SID
domain .google.com
path /
expires 1600000000

http://code.google.com/p/pyrfeed/wiki/GoogleReaderAPI

in your example, only the name is set.

Anthony Ettinger

May 16th, 2010 at 4pm

I had this problem earlier with curl…but how do I authenticate (save to cookiejar) and then re-use that cookiejar for subsequent requests?

It seems like the cookiejar isn’t re-used within the same script (2 curb calls), only on the next run of the script…then it works.

Anthony Ettinger

May 16th, 2010 at 4pm

I just found this explanation — the cookiejar is NOT written until curl exits — or curl_easy_cleanup is called. I do not see a method to do that in Curb:

http://curl.haxx.se/libcurl/c/curl_easy_cleanup.html

I tried creating a new instance of Curl::Easy, but the script still doesn’t write the file until the curl lib is done.

Sergi

October 28th, 2010 at 1am

I’ve found a way to pass cookies from one page to another. After the first perform, simply make the next request with Curl::Easy.perform(2nd URL) instead of Curl::Easy.new.

For example:

curl = Curl::Easy.new('1st URL')
curl.enable_cookies = true
curl.cookiefile = '/tmp/cookie.txt'
curl.cookiejar = '/tmp/cookie.txt'
curl.http_post(post_data)  # post_data: your form fields
curl.perform

curl = Curl::Easy.perform('2nd URL')
