Why can I log in to the Amazon website using Python mechanize, but not requests or urllib2?
I can use the following piece of Python code, found here, to log in to amazon.com:
    import mechanize

    br = mechanize.Browser()
    br.set_handle_robots(False)
    br.addheaders = [("User-agent", "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.13) Gecko/20101206 Ubuntu/10.10 (maverick) Firefox/3.6.13")]

    # Open the sign-in page, fill in the form named "sign-in", and submit it.
    sign_in = br.open('https://www.amazon.com/gp/sign-in.html')
    br.select_form(name="sign-in")
    br["email"] = 'test@test.com'
    br["password"] = 'test4test'
    logged_in = br.submit()

    orders_html = br.open("https://www.amazon.com/gp/css/history/orders/view.html?orderfilter=year-%s&startatindex=1000" % 2013)
But the following two pieces, using the requests module and urllib2, do not work.
    import requests

    username = "test@test.com"
    password = "test4test"
    login_data = {'email': username, 'password': password, 'flex_password': 'true'}
    url = 'https://www.amazon.com/gp/sign-in.html'
    # Headers must be a dict (name: value), not a set.
    agent = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.57 Safari/537.1'}

    # Session() takes no arguments; default headers are set on the session itself.
    session = requests.Session()
    session.headers.update(agent)
    r = session.get('http://www.amazon.com')
    r1 = session.post(url, data=login_data, cookies=r.cookies)
    r2 = session.post("https://www.amazon.com/gp/css/history/orders/view.html?orderfilter=year-2013&startatindex=1000", cookies=r1.cookies)
#
    import urllib2
    import urllib
    import cookielib

    amazon_username = "test@test.com"
    amazon_password = "test4test"
    url = 'https://www.amazon.com/gp/sign-in.html'

    cookie = cookielib.CookieJar()
    login_data = urllib.urlencode({'email': amazon_username, 'password': amazon_password})
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
    opener.addheaders = [('User-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.57 Safari/537.1')]

    # urllib2 needs a full URL including the scheme.
    opener.open('http://www.amazon.com')
    response = opener.open(url, login_data)
    response = opener.open("https://www.amazon.com/gp/css/history/orders/view.html?orderfilter=year-%s&startatindex=1000" % 2013, login_data)
What did I do wrong in posting the Amazon login form? This is my first time posting a form. Any help is appreciated.
I prefer to use urllib2 or requests because my other code uses these two modules.
Moreover, can anybody comment on the speed of mechanize, requests, and urllib2, and on any other advantages mechanize has over the other two?
~~~~~~~~~~~new~~~~~~~~~~~~
Following c.c.'s instructions, I can now log in with urllib2. But when I try the same with requests, it still does not work. Can anyone give me a clue?
    import requests

    fb_username = "test@test.com"
    fb_password = "xxxx"
    login_data = {'email': fb_username, 'password': fb_password, 'action': 'sign-in'}
    url = 'https://www.amazon.com/gp/sign-in.html'
    # Headers must be a dict (name: value), not a set.
    agent = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.57 Safari/537.1'}

    # Session() takes no arguments; default headers are set on the session itself.
    session = requests.Session()
    session.headers.update(agent)
    r = session.get(url)
    r1 = session.post('https://www.amazon.com/gp/flex/sign-in/select.html', data=login_data, cookies=r.cookies)
    b = r1.text
Regarding the urllib2 approach, you are missing two things.
First, if you look at the source of sign-in.html, it shows:
<form name="sign-in" id="sign-in" action="/gp/flex/sign-in/select.html" method="post">
meaning the form should be submitted to select.html.
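As a rough illustration of that first point, the form's target can be read from the page instead of being hard-coded. This is only a sketch: it uses Python 3 standard-library names (`html.parser` and `urllib.parse`; in Python 2 these are `HTMLParser` and `urlparse`), the `FormActionParser` class is a hypothetical helper, and the HTML is just the form tag quoted above rather than a live page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormActionParser(HTMLParser):
    """Record the action and method of the form with a given name."""

    def __init__(self, form_name):
        super().__init__()
        self.form_name = form_name
        self.action = None
        self.method = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "form" and attributes.get("name") == self.form_name:
            self.action = attributes.get("action")
            # Forms without a method attribute default to GET.
            self.method = (attributes.get("method") or "get").lower()


# The form tag quoted above, standing in for the fetched page.
page_url = "https://www.amazon.com/gp/sign-in.html"
page_html = (
    '<form name="sign-in" id="sign-in" '
    'action="/gp/flex/sign-in/select.html" method="post"></form>'
)

parser = FormActionParser("sign-in")
parser.feed(page_html)

# A relative action is resolved against the URL the page was loaded from.
post_url = urljoin(page_url, parser.action)
print(parser.method, post_url)
```

This is roughly the lookup that mechanize does when `select_form` and `submit` are called, which is why the mechanize version did not need the target URL spelled out.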
Second, besides the email and password, you need to select whether you are an existing user or not:
<input id="newcust" type="radio" name="action" value="new-user"...>
...
<input id="returningcust" type="radio" name="action" value="sign-in"...>
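To make the second point concrete, here is a small sketch of how the radio group turns into a form field. It assumes Python 3, uses a hypothetical `RadioOptionParser` helper, and feeds it trimmed copies of the two inputs above (the attributes elided with "..." are simply left out). The credentials are the placeholders from the question.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class RadioOptionParser(HTMLParser):
    """Collect the possible values of every radio-button group."""

    def __init__(self):
        super().__init__()
        self.options = {}  # group name -> list of values, in page order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("type") == "radio":
            name = attributes.get("name")
            self.options.setdefault(name, []).append(attributes.get("value"))


# Trimmed copies of the two radio inputs quoted above.
form_html = (
    '<input id="newcust" type="radio" name="action" value="new-user">'
    '<input id="returningcust" type="radio" name="action" value="sign-in">'
)

parser = RadioOptionParser()
parser.feed(form_html)
print(parser.options)  # {'action': ['new-user', 'sign-in']}

# A browser sends only the selected radio value; "sign-in" is the
# returning-customer option, so it goes into the POST body as "action".
login_data = urlencode({
    "action": "sign-in",
    "email": "test@test.com",
    "password": "test4test",
})
print(login_data)  # action=sign-in&email=test%40test.com&password=test4test
```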
It should look something like this:
    import cookielib
    import urllib
    import urllib2

    amazon_username = ...
    amazon_password = ...

    login_data = urllib.urlencode({'action': 'sign-in',
                                   'email': amazon_username,
                                   'password': amazon_password,
                                   })

    cookie = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
    opener.addheaders = [('User-agent', ...)]

    # Load the sign-in page first so the session cookies are set.
    response = opener.open('https://www.amazon.com/gp/sign-in.html')
    print(response.getcode())

    # Post the form to the URL given in its action attribute.
    response = opener.open('https://www.amazon.com/gp/flex/sign-in/select.html', login_data)
    print(response.getcode())

    response = opener.open("https://www.amazon.com/")  # it should show that you are logged in
    print(response.getcode())
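For readers on Python 3, where `urllib2` and `cookielib` became `urllib.request` and `http.cookiejar`, the same flow might be sketched as below. The function names (`make_opener`, `build_login_request`, `log_in`) are hypothetical, the URLs are the ones from the answer and may no longer match what the site serves, and the credentials are the question's placeholders. Only the request is built here; `log_in` is defined but not called, so nothing is sent.

```python
import http.cookiejar
import urllib.parse
import urllib.request

SIGN_IN_PAGE = "https://www.amazon.com/gp/sign-in.html"
SIGN_IN_POST = "https://www.amazon.com/gp/flex/sign-in/select.html"


def make_opener(user_agent):
    """Return an opener that keeps cookies between requests."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    opener.addheaders = [("User-agent", user_agent)]
    return opener


def build_login_request(email, password):
    """Build (but do not send) the POST for the sign-in form."""
    fields = {"action": "sign-in", "email": email, "password": password}
    # In Python 3 the request body must be bytes, not str.
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(SIGN_IN_POST, data=body)


def log_in(opener, email, password):
    """Fetch the sign-in page for its cookies, then submit the form."""
    opener.open(SIGN_IN_PAGE)
    response = opener.open(build_login_request(email, password))
    return response.getcode()


request = build_login_request("test@test.com", "test4test")
print(request.get_method(), request.full_url)
```

Supplying `data` is what makes the request a POST, just as passing `login_data` as the second argument to `opener.open` does in the urllib2 version.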