How to log in to WordPress with Python, BeautifulSoup, and httpx

The key to logging in to WordPress programmatically, at least on the site I was working with, seemed to be having a cookie that gets set on the first visit to the page, along with the nonce set as a hidden value in the login form. (The nonce is likely not always required; it appears to be set by WooCommerce.)

The cookie requirement may be standard behavior, or it may come from a plugin called Snow Monkey Forms (I see a cookie set by that).

I have it working and don't need to dig too far into the specifics, but some quick reading suggested that WordPress sets a test cookie to make sure cookies can be set, and will only process the login if that cookie is present.

Since I need to visit the login form page to get the nonce anyway, I just make sure to send the cookies set during that visit when submitting the form. I might have been able to preset a cookie and submit that instead, but I have to visit the page anyway, so there is no benefit.
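Pulling the nonce out of the form can be generalized a little by collecting every hidden input rather than one specific field. The following is a sketch under assumptions: `hidden_form_fields` is a hypothetical helper, and `SAMPLE_HTML` is made-up markup modeled on the field names used in this post, not the real page source.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; a real login page will contain more fields.
SAMPLE_HTML = """
<form class="login" method="post">
  <input type="text" name="username">
  <input type="hidden" id="woocommerce-login-nonce"
         name="woocommerce-login-nonce" value="abc123">
  <input type="hidden" name="_wp_http_referer" value="/my-account/">
</form>
"""


def hidden_form_fields(html: str, form_selector: str = "form.login") -> dict:
    """Return name -> value for every hidden input inside the first matching form."""
    soup = BeautifulSoup(html, "html.parser")
    form = soup.select_one(form_selector)
    if form is None:
        raise ValueError(f"no form matching {form_selector!r}")
    return {
        field["name"]: field.get("value", "")
        for field in form.select('input[type="hidden"]')
        if field.get("name")
    }


fields = hidden_form_fields(SAMPLE_HTML)
```

The resulting dictionary could be merged with the username and password before posting, which would avoid hard-coding the nonce field name if the site changes it.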

I didn't want to log in for every page load, so I save the cookies and only refresh them once a day. I could probably keep them longer by checking their expiration, or by confirming that they still work and only fetching new ones when they don't, but hitting the server to renew the cookies once a day is gentle enough for my needs.
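The expiration-based alternative mentioned above might look roughly like this. It is a sketch, not what I actually run: `load_unexpired_cookies` is a hypothetical helper, and it relies on `LWPCookieJar.load` skipping expired cookies when `ignore_expires` is left at its default of `False`. Session cookies carry no expiry time, so they never count as expired here; this check only helps for cookies with an explicit expiry.

```python
import os
from http.cookiejar import LWPCookieJar
from typing import Optional


def load_unexpired_cookies(cookie_file: str) -> Optional[LWPCookieJar]:
    """Return the saved jar, or None if the file is missing or no usable cookie is left."""
    if not os.path.isfile(cookie_file):
        return None
    jar = LWPCookieJar(filename=cookie_file)
    # ignore_discard=True keeps session cookies; expired cookies are
    # dropped because ignore_expires defaults to False.
    jar.load(ignore_discard=True)
    return jar if len(jar) else None
```

A caller would log in again only when this returns `None`, instead of relying on the age of the file.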

Here is the login function I use. I'd love feedback on how to polish it:

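import os
import time
from http.cookiejar import LWPCookieJar

import httpx
from bs4 import BeautifulSoup


def login(base_url: str, login_path: str) -> httpx.Cookies:
    cookie_file = 'cookiejar'
    cookiejar = LWPCookieJar(filename=cookie_file)
    # Reuse the saved cookies while the file is fresh and actually contains cookies.
    if os.path.isfile(cookie_file) and not is_file_older_than_x_days(cookie_file):
        cookiejar.load(ignore_discard=True)
        if len(cookiejar):
            return httpx.Cookies(cookiejar)
    # Otherwise log in again. Returning early above means an empty cookie file
    # falls through to here instead of leaving `cookies` undefined.
    with httpx.Client() as client:
        # The first visit sets the cookies and provides the nonce.
        r = client.get(base_url + login_path)
        r.raise_for_status()
        soup = BeautifulSoup(r.text, 'html.parser')
        form = soup.select('form.login')[0]
        nonce = form.select('#woocommerce-login-nonce')[0].get('value')
        # Login is a model defined elsewhere (not shown) with username, password, and nonce fields.
        payload = Login(
            username=os.getenv('USERNAME'),
            password=os.getenv('PASSWORD'),
            nonce=nonce
        ).dict()
        # The form expects the hyphenated field name.
        payload['woocommerce-login-nonce'] = payload.pop('nonce')
        payload['_wp_http_referer'] = login_path
        # The client sends the cookies from the first visit automatically.
        r = client.post(base_url + login_path, data=payload)
        r.raise_for_status()
        cookies = client.cookies
        for cookie in cookies.jar:
            cookiejar.set_cookie(cookie)
        cookiejar.save(ignore_discard=True)
    return cookies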

def is_file_older_than_x_days(file, days=1):
    file_time = os.path.getmtime(file)
    return (time.time() - file_time) / 3600 > 24 * days
