Ajax

Does Google execute JavaScript?

Yet another reason to not assume your JavaScript will always run, or if it does run, that it will run in its entirety:

I’m told: Yes, it’s 2016; of course Google executes JavaScript.

But I’m also told: Server-side rendering is necessary for SEO.

If Google can run JavaScript and thereby render client-side views, why is server-side rendering necessary for SEO? Okay, Google isn’t the only search engine, but it’s obviously an important one to optimize for.

Recently I ran a simple experiment to see to what extent the Google crawler understands dynamic content. I set up a web page at doesgoogleexecutejavascript.com which does the following:

  1. The HTML from the server contains text which says “Google does not execute JavaScript.”
  2. There is some inline JavaScript on the page that changes the text to “Google executes JavaScript, but only if it is embedded in the document.”
  3. The HTML also links to a script which, when loaded, changes the text to “Google executes JavaScript, even if the script is fetched from the network. However, Google does not make AJAX requests.”
  4. That script makes an AJAX request and updates the text with the response from the server. The server returns the message “Google executes JavaScript and even makes AJAX requests.”

After I launched this page, I linked to it from its GitHub repository and waited for Google to discover it.

[…]

It seems Google is not guaranteed to run your JavaScript automatically. You may have to manually trigger a crawl. And, even then, Google apparently won’t do any AJAX requests your page may depend on, or at least it didn’t in my case.

[…]

My conclusion is: Google may or may not decide to run your JavaScript, and you don’t want your business to depend on its particular inclination of the day. Do server-side/universal/isomorphic rendering just to be safe.

Don't go single-page-app too soon, or how GitHub reimplementing navigation in JavaScript loses streaming capability

A few weeks ago I was at Heathrow airport getting a bit of work done before a flight, and I noticed something odd about the performance of GitHub: It was quicker to open links in a new window than simply click them.

[…]

When you load a page, the browser takes a network stream and pipes it to the HTML parser, and the HTML parser is piped to the document. This means the page can render progressively as it’s downloading. The page may be 100k, but it can render useful content after only 20k is received.

This is a great, ancient browser feature, but as developers we often engineer it away. Most load-time performance advice boils down to “show them what you got” - don’t hold back, don’t wait until you have everything before showing the user anything.

GitHub cares about performance so they server-render their pages. However, when navigating within the same tab navigation is entirely reimplemented using JavaScript. Something like…

Code language: JavaScript

// …lots of code to reimplement browser navigation…
const response = await fetch('page-data.inc');
const html = await response.text();
document.querySelector('.content').innerHTML = html;
// …loads more code to reimplement browser navigation…

This breaks the rule, as all of page-data.inc is downloaded before anything is done with it. The server-rendered version doesn’t hoard content this way, it streams, making it faster. For GitHub’s client-side render, a lot of JavaScript was written to make this slow.

I’m just using GitHub as an example here - this anti-pattern is used by almost every single-page-app.

Switching content in the page can have some benefits, especially if you have some heavy scripts, as you can update content without re-evaluating all that JS. But can we do that without losing streaming?

[…]

Newline-delimited JSON

A lot of sites deliver their dynamic updates as JSON. Unfortunately JSON isn’t a streaming-friendly format. There are streaming JSON parsers out there, but they aren’t easy to use.

So instead of delivering a chunk of JSON:

Code language: JavaScript

{
  "Comments": [
    {"author": "Alex", "body": "…"},
    {"author": "Jake", "body": "…"}
  ]
}

…deliver each JSON object on a new line:

Code language: JavaScript

{"author": "Alex", "body": "…"}
{"author": "Jake", "body": "…"}

This is called “newline-delimited JSON” and there’s a sort-of standard for it. Writing a parser for the above is much simpler. In 2017 we’ll be able to express this as a series of composable transform streams:

Code language: JavaScript

const response = await fetch('comments.ndjson');
const comments = response.body
  // From bytes to text:
  .pipeThrough(new TextDecoder())
  // Buffer until newlines:
  .pipeThrough(splitStream('\n'))
  // Parse chunks as JSON:
  .pipeThrough(parseJSON());
 
for await (const comment of comments) {
  // Process each comment and add it to the page:
  // (via whatever template or VDOM you're using)
  addCommentToPage(comment);
}

…where splitStream and parseJSON are reusable transform streams. But in the meantime, for maximum browser compatibility we can hack it on top of XHR.

Again, I’ve built a little demo where you can compare the two, here are the 3g results:

Image

Versus normal JSON, ND-JSON gets content on screen 1.5 seconds sooner, although it isn’t quite as fast as the iframe solution. It has to wait for a complete JSON object before it can create elements, you may run into a lack-of-streaming if your JSON objects are huge.

Don’t go single-page-app too soon

As I mentioned above, GitHub wrote a lot of code to create this performance problem. Reimplementing navigations on the client is hard, and if you’re changing large parts of the page it might not be worth it.

[…]

[A] simple no-JavaScript browser navigation to a server rendered page is roughly as fast. The test page is really simple aside from the comments list, your mileage may vary if you have a lot of complex content repeated between pages (basically, I mean horrible ad scripts), but always test! You might be writing a lot of code for very little benefit, or even making it slower.