Swizec Teller - a geek with a hatswizec.com

Senior Mindset Book

Get promoted, earn a bigger salary, work for top companies

Senior Engineer Mindset cover
Learn more

    How I reverse-engineered Hacker News

    Hacker News has a public API, and it's great. Live frontpage updates, live upvote counts, live comments. Everything you want or need at the tip of your fingers in real-time.

    But there's no write. You can't login, you can't upvote, and you can't post. You can't do anything but read. And you need writes to make a Hacker News App.

    photos images original 000 415 209 3b4

    Without a public write API, the only thing left to do was to reverse engineer HackerNews forms and fake them with fetch() requests. Technically, this falls under cross site request forgery, but it isn't malicious.

    Sure, we're pretending that we're a browser that's submitting forms, but we do what the user expects. That makes it okay, I think. HackerNews allows it at least. 1 shrug

    So how do you approach this sort of problem?

    It happens in 3 steps:

    1. Do the action
    2. Inspect the request
    3. Fake the request

    The Unofficial HN write API wrapper lives as a Gist. I'm tempted to release it as an NPM module. Should I?

    I'm going to show you how this works on two examples: A Login form and an Upvote button. The first is a normal POST request; the second is a GET request with some fakery protection.

    Do the action

    HackerNews's login form looks like this:

    Which in HTML looks like this:

    <form method="post" action="login">
      <input type="hidden" name="goto" value="news" />
      <table border="0">
        <tbody>
          <tr>
            <td>username:</td>
            <td>
              <input
                type="text"
                name="acct"
                size="20"
                autocorrect="off"
                spellcheck="false"
                autocapitalize="off"
                autofocus
                autocomplete="off"
              />
            </td>
          </tr>
          <tr>
            <td>password:</td>
            <td>
              <input type="password" name="pw" size="20" autocomplete="off" />
            </td>
          </tr>
        </tbody>
      </table>
      <br />
      <input type="submit" value="login" />
    </form>
    

    Looking at this HTML tells us HackerNews uses a POST request to to the /login URL to log you in, then it sends a goto field2, and two text fields; acct and pw. Whoever wrote this code doesn't believe in typing things out.


    The Upvote action isn't a form; it's a link that looks like this:

    <a id="up_14918911" onclick="return vote(event, this, "up")" href="vote?id=14918911&how=up&auth=0855997b736abc0c65600fee44fbf9bbf3bca65c&goto=news"><div class="votearrow" title="upvote"></div></a>
    

    This tells us that voting happens through JavaScript. We can look for the vote() function in HackerNews's javascript code. Lucky for us, it is neither obfuscated nor minified. This makes it easy to read right in Chrome DevTools.

    The vote() function looks like this:

    function vote(ev, el, how) {
      var id = el.id.split(/_/)[1];
      var up = $('up_' + id);
      vis(up, how == 'un');
      vis($('down_' + id), how == 'un');
      var unv = '';
      if (how != 'un') {
        unv = " | <a id="un_" + id
          + "" onclick="return vote(event, this,\"un\")" href=""
          + up.href.replace(" how="up','how=un')" +="" "'="">" + (how == 'up' ? 'unvote' : 'undown') + "</a>"
      }
      $('unv_' + id).innerHTML = unv;
      new Image().src = el.href;
      ev.stopPropagation();
      return false;
    }
    

    Most of this code deals with updating the UI. The interesting line is new Image().src = el.href. It tells us that HackerNews upvotes use a GET request to the URL set in the link.

    In this case, that link is /vote?id=14918911&how=up&auth=0855997b736abc0c65600fee44fbf9bbf3bca65c&goto=news.

    The auth argument looks tricky. It's some sort of authentication token making sure these requests don't get forged. Look at other examples on the page, and you'll see the auth token is different on every upvote button.

    That means it isn't a per-user token, it's a per-user-per-story token. Which means we're going to have to find the URL every time we want to upvote anything.

    Inspect the request

    We can confirm our findings by making the actions and inspecting their requests in Chrome DevTools.

    Due to redirects, you have to enable Preserve Log. Somehow I had never noticed that feature existed before doing this.

    As we assumed, the login request sends goto, acct, and pw, and sets a cookie.


    Inspecting the upvote request follows a similar process. This is what we find:

    No surprises here. We have to send a GET request to the upvote URL that's unique for every upvote button. We also send our login cookie so we can assume the server is doing some sort of validation between who's logged in and what the auth token says.

    Fake the request

    To fake these requests, we use fetch(). Login is simple; it looks like this:

    class HN {
      BaseURL = "https://news.ycombinator.com";
    
      login(username, password) {
        let headers = new Headers({
          "Content-Type": "application/x-www-form-urlencoded",
          "Access-Control-Allow-Origin": "*",
        });
    
        return fetch(`${this.BaseURL}/login`, {
          method: "POST",
          headers: headers,
          body: convertRequestBodyToFormUrlEncoded({
            acct: username,
            pw: password,
            goto: "news",
          }),
          mode: "no-cors",
          credentials: "include",
        })
          .then((res) => res.text())
          .then((body) => {
            if (body.match(/Bad Login/i)) {
              return false;
            } else {
              return true;
            }
          });
      }
    }
    

    We set a Content-Type header to application/x-www-form-urlencoded. Many servers can handle params in any shape they come, but HackerNews is not one of those. This is the only type that works.

    Our fetch() request uses the POST method and our headers, converts our {acct, pw, goto} data object into a URL encoded string, asks the server (or is it the browser?) to please ignore CORS, and we set credentials: 'include' to automagically handle cookies. That's how you expect client-side requests to behave, but you have to explicitly ask for it with fetch().

    HackerNews responds with either the login page or the news page. If login was successful, we get the news page. Unsuccessful, and we get the login page with a Bad Login text.

    We look for said Bad Login text to decide if we are now logged in. Hopefully there are no frontpage stories with Bad Login in their title :D


    Faking an upvote request is more tricky. We have to first fetch the URL.

    To do that we use CheerioJs, an HTML parser that works in JavaScript and gives us a jQuery-like interface to the document. It's designed to work in node, but you can use the cheerio-without-node in the browser or React Native.

    Each HackerNews story can have its own page. This makes our job easier. We can request that page, parse it, and find the upvote button. That looks like this:

    class HN {
      // ..
      getUpvoteURL(id) {
        return fetch(`${this.BaseURL}/item?id=${id}`, {
          mode: "no-cors",
          credentials: "include",
        })
          .then((res) => res.text())
          .then((body) => {
            const doc = cheerio.load(body);
    
            return doc(`#up_${id}`).attr("href");
          });
      }
    }
    

    We're making a basic GET request, asking to avoid CORS, and including our credentials. That way HackerNews thinks we're a browser with a logged in user and gives us exactly what it would give to a normal user.

    We use cheerio.load to parse the HTML and look for the upvote URL with doc(#up_${id}). Just like we would with jQuery.

    You can find the correct ID in the upvote button's HTML. Here it is again:

    <a id="up_14918911" onclick="return vote(event, this, "up")" href="vote?id=14918911&how=up&auth=0855997b736abc0c65600fee44fbf9bbf3bca65c&goto=news"><div class="votearrow" title="upvote"></div></a>
    

    This part: <a id="up_...". It's unique for every upvote link and helps us find the right one.

    To use this link, we issue a fetch() request similar to the one we used to find it.

    class HN {
      // ...
      upvote(id) {
        return this.getUpvoteURL(id)
          .then((url) =>
            fetch(`${this.BaseURL}/${url}`, {
              mode: "no-cors",
              credentials: "include",
            })
          )
          .then((res) => res.text())
          .then((body) => {
            return true;
          })
          .catch((error) => {
            console.log(error);
            return false;
          });
      }
    }
    

    Get the URL, make a fetch, include credentials, assume it worked. HackerNews doesn't give us any real indication that it worked or not. We have to assume.

    I guess we could re-fetch the item's page and see if that upvote button is still there. 🤔 Sending 2 requests for every upvote feels bad enough, though. No need to make it 3.

    Your new superpower

    You can use this approach to reverse engineer any website. If it works on HackerNews, it's going to work everywhere.

    Apps and websites fundamentally cannot guard against something like this. If legitimate users can use the site, then you can pretend to be a real user and fake those requests. It's just how it is.

    HackerNews makes it easy. For others, it might be trickier.

    Use your new superpower responsibly.

    1. There are ways to prevent cross-site requests with CORS policies, domain filtering, and stuff like that.
    2. All HN actions have a goto field or argument, I assume it tells the server where to redirect after the request.
    Published on August 3rd, 2017 in Learning, Technical

    Did you enjoy this article?

    Continue reading about How I reverse-engineered Hacker News

    Semantically similar articles hand-picked by GPT-4

    Senior Mindset Book

    Get promoted, earn a bigger salary, work for top companies

    Learn more

    Have a burning question that you think I can answer? Hit me up on twitter and I'll do my best.

    Who am I and who do I help? I'm Swizec Teller and I turn coders into engineers with "Raw and honest from the heart!" writing. No bullshit. Real insights into the career and skills of a modern software engineer.

    Want to become a true senior engineer? Take ownership, have autonomy, and be a force multiplier on your team. The Senior Engineer Mindset ebook can help 👉 swizec.com/senior-mindset. These are the shifts in mindset that unlocked my career.

    Curious about Serverless and the modern backend? Check out Serverless Handbook, for frontend engineers 👉 ServerlessHandbook.dev

    Want to Stop copy pasting D3 examples and create data visualizations of your own? Learn how to build scalable dataviz React components your whole team can understand with React for Data Visualization

    Want to get my best emails on JavaScript, React, Serverless, Fullstack Web, or Indie Hacking? Check out swizec.com/collections

    Did someone amazing share this letter with you? Wonderful! You can sign up for my weekly letters for software engineers on their path to greatness, here: swizec.com/blog

    Want to brush up on your modern JavaScript syntax? Check out my interactive cheatsheet: es6cheatsheet.com

    By the way, just in case no one has told you it yet today: I love and appreciate you for who you are ❤️

    Created by Swizec with ❤️