A history of frameworks

April 25, 2018

Suppose you wanted to build a web scraping API that would present the structured HTML data of some web page as JSON for other sites to consume. Certain aspects of a site's structure matter when designing a scraper like this. For example, it helps if the site uses human-readable (or at least meaningful) query parameters, and if the meaningful data on the page sits in a predictable place. The HTML file should be more than just this:

    <script src="./script.js"></script>

The point here is that web crawlers and scrapers are often limited to the data that is statically available to them on the page. Without running the script to completion, the crawler cannot determine what data is on the page. Worse, no program can determine in advance, for an arbitrary script, whether it will run to completion at all (this is the halting problem). As a result, indexing the Web and scraping from sites is more difficult, and sometimes even impossible. Furthermore, allowing arbitrary mobile code to run on client machines creates a number of security holes that do not exist with plain HTML webpages. Because HTML is a declarative language, it does not by itself introduce such holes.

The rise of JavaScript frameworks has made the Web more complex than Tim Berners-Lee had perhaps ever dreamt of. Though Web programmers have a powerful toolbox in JavaScript, there are also a number of disadvantages. In this paper, we will discuss some of the important trade-offs that JavaScript frameworks have introduced. We will also observe how JavaScript aligns with the original vision of the Web.

History of JavaScript

The first version of JavaScript was famously (or perhaps infamously) created at Netscape by Brendan Eich in the span of ten days (1). Because the language was developed in a rush and adopted immediately, the poor design decisions made in those ten days still have lasting effects today. In many ways, the language did not plan for evolution. JavaScript was originally designed so that Web pages could behave dynamically and users could interact with the content in front of them. Originally called Mocha, it was designed to be easy to use for small-scale scripts on the Netscape browser. Several design decisions made JavaScript accessible for early adopters. First, its simplicity helped the language scale to a large population of developers. Second, the Java-like syntax gave first-time users an immediate sense of familiarity. Third, treating functions as first-class objects provided a powerful construct for event-driven programming.

At first, the language was used for its original purpose: to provide simple dynamic content to users easily. Small-scale animations, games, and basic user interactions were built using JavaScript. Plain JavaScript code was often messy and difficult to read in the early days, especially when standards did not exist, leading to buggy webpages. As developers became more experienced with the language, the ECMAScript standard was developed to establish uniformity across implementations of the language as well as best practices for engineers.

JavaScript Frameworks

History

One of the earliest JavaScript frameworks is one that is still widely used today: jQuery. The spirit of jQuery was in line with that of JavaScript. The library simplified the manipulation of HTML elements. Combined with AJAX (Asynchronous JavaScript and XML), a new technique of the early 2000s, jQuery was a powerful new tool in the hands of developers who wanted to work on cross-browser applications. AJAX allowed Web pages to exchange data with the server after the initial page load, without reloading the page. jQuery was built by John Resig with two principles in mind: simplicity and compatibility (2).

On the other hand, the event-driven nature of jQuery led to buggy Web pages for many developers. The notion of a “single source of truth” could become lost in thousands of lines of jQuery, since components within pages were often related in ways that were difficult to keep track of. On sites with hundreds or thousands of interactive elements, jQuery codebases grew tremendously. Furthermore, due to (relatively) slow network speeds during the early years of jQuery, the 30 KB library increased page load time significantly. Nevertheless, thanks to its cross-browser compatibility and simple syntax, jQuery became enormously popular, at one point appearing on almost 90% of all Web sites (3).

After the introduction of jQuery, a trend began to change the Web into what we see today. The new kinds of websites that developers wanted to create could not be built with jQuery alone. Faster networks meant that developers did not need to be concerned about the bloat that JavaScript libraries or frameworks added. Developers began writing single page applications, which required a more structured approach to the use of JavaScript. The model-view-controller design philosophy was introduced, in contrast to the model-view philosophy of HTML and CSS. While HTML and CSS would handle the content and the presentation respectively, JavaScript could handle interactions, AJAX requests, and dynamic content. New frameworks were built to automatically synchronize data between the model and the view. Though single page applications may not have been in Tim Berners-Lee’s original vision, there is no doubt that these applications have significantly changed the Web. Frameworks like AngularJS, Backbone, React, and Ember have tied JavaScript intimately to the DOM. Live data-rendering and interactive Web applications were made possible with these frameworks. At the same time, there were significant disadvantages to using frameworks, including visibility, security, and turnover time.

Issue: Search Engine Optimization and Crawlers

Among the biggest of these issues is search engine optimization. Some frameworks allowed Web pages to render on the client side, purely from JavaScript. This means that the served HTML body could be as little as a single script tag. As we discussed above, search engines like Google have trouble indexing pages that are dynamically generated. The static nature of HTML makes it simple to follow links as the Google crawler does. In the early days of JavaScript frameworks, Google discouraged their use (4). Google described the trade-off in building single page AJAX-based applications: “making your application more responsive has come at a huge cost: crawlers are not able to see any content that is created dynamically” (4). This post was from 2009.
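To make the point about following links concrete, here is a minimal sketch of the link-collection step of a crawler, written with Python's standard library. The class name, the function name, the two sample pages, and the example.com URL are all illustrative assumptions, and a real crawler such as Google's certainly does far more than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag found in static HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all links present in the served HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical pages: one server-rendered, one rendered entirely by a script.
STATIC_PAGE = '<body><a href="/about">About</a> <a href="posts/1">Post</a></body>'
SCRIPT_ONLY_PAGE = '<body><script src="./script.js"></script></body>'

print(extract_links(STATIC_PAGE, "https://example.com/"))
print(extract_links(SCRIPT_ONLY_PAGE, "https://example.com/"))
```

The static page yields two links to follow. The script-only page yields none, because any links it would eventually show exist only after the script has run.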

Of course, Google could not stand still while JavaScript apps began to take over the Web. Google, in a sense, applied Postel’s law in their recommendation. In 2009, they were conservative about what sort of content they claimed to be able to index. Google used its leverage on the Web to govern the way that developers built their sites. At the time, Google asked for snapshots of dynamically generated sites so that their crawlers could parse the data. Even so, Google still worked to index AJAX sites and single page applications. They began rendering JavaScript-generated pages on their crawlers as browsers would (5). Google needed to stay ahead of Web developers in order to keep their search ahead of the trend. They disincentivized building AJAX applications so that they could get a head start on indexing those sites.

Google finally deprecated that recommendation in 2015 (6). Google can now index most JavaScript pages, crawling them as a modern browser would. Notably, in the same post, Google encourages the principle of “progressive enhancement” for Web pages, which is related to the principles of partial understanding and backwards compatibility. Web developers are encouraged to present content first and foremost, with layers of complexity added to match the capabilities of different browser implementations. With this principle in mind, users on very old browsers would still see all the essentials of the document, while users on newer browsers would also get an engaging or interactive presentation.

However, Google isn’t the only company that crawls the Web. Smaller search engines with fewer engineering resources may not have implemented the same full-scale crawling capability that Google has. Developers looking to scrape from Web sites have a much harder time when the data is dynamically generated. For example, in my own tests, I’ve observed that Python’s urllib library does not run JavaScript before returning. If a developer wants to scrape text from a page written in AngularJS, they will need to somehow run JavaScript before collecting the data.
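The urllib observation can be reproduced without a network connection. The sketch below is only an illustration: the helper names and the two sample documents are assumptions made for this example, and `fetch_html` is included only to show where urllib would fit. It extracts the visible text from a server-rendered page and from a page whose body is little more than a script tag.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextCollector(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Return the text a static scraper can see in the served HTML."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_html(url):
    """Fetch a page with urllib; the bytes come back as served, no script runs."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical documents standing in for real responses.
SERVER_RENDERED = "<html><body><h1>Shows</h1><p>Episode 1</p></body></html>"
CLIENT_RENDERED = (
    '<html><body><div id="app"></div>'
    '<script src="./script.js"></script></body></html>'
)

print(repr(visible_text(SERVER_RENDERED)))
print(repr(visible_text(CLIENT_RENDERED)))
```

The second call returns an empty string. To see the content of a client-rendered page, the scraper would have to execute the script first, for example by driving a real browser engine.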

Even after Google’s 2015 announcement, issues still came up with SEO on sites running mostly JavaScript. For example, Google’s crawler failed to index shows on Hulu, the media streaming site. The site uses JavaScript to render much of its content, while at the same time preventing third parties from hosting their media (7). Another experiment showed that client-rendered Angular sites failed SEO tests. One Google employee was quoted as saying “if you care about SEO, you still need to have server-rendered content” (8).

Some front-end frameworks allow for client-side routing, implemented with JavaScript. That is, the URI changes on the client, without making a request to the server. Depending on how the framework implements routing, crawlers may or may not see the client-rendered routes. In one experiment, Google’s crawler gave up on using a React client-side router to render a page (9). This means that many client-routed websites are not properly indexed by Google. As more applications are designed with client-side routing, this issue could become more problematic for Google. Both Google and the website using client-side routing lose value if these sites are not properly indexed.

There are a few takeaways from the issue of SEO with JavaScript frameworks. Developers who are focused on getting indexed in search engines should primarily render their content server-side. At the same time, developers can count on Google to use its resources to support indexing pages generated with popular frameworks, at least eventually. Also, Web developers should try to cater their pages to the largest possible set of users, whether those users are bots or people using an old version of Internet Explorer.

Issue: Page Bloat and Open Source

When dial-up connections were common in the 2000s, the 30 KB jQuery package could take a significant amount of time to load. Since then, networks have become faster than ever, serving hundreds of megabits per second. However, with the rise of JavaScript frameworks, some Web pages now depend on many megabytes of JavaScript code. Developers have started to abuse the bandwidth available to them by introducing large libraries into applications that only need a tiny fraction of the library. Though this may not seem like a huge issue in itself, the embrace of large libraries leads to other problems.

JavaScript frameworks often encourage the use of third-party libraries to extend the features of the framework. For example, React in particular is known for its reliance on add-ons. Node package manager (NPM) makes it easier than ever to include open source code in websites. As a result, hundreds of thousands of lines of code could end up in relatively simple web applications. This code, if left uninspected, could be malicious or introduce bugs into the application. Some NPM packages are so heavily relied upon that changing them will “break the internet”. In one example from 2016, an NPM package was removed over a legal dispute about the name of the package (10). Web developers around the world were brought to a halt when an eleven-line NPM package (that React depended on) was unpublished. The package was a single function that left-padded strings with a given character. For several hours on March 22, 2016, developers depending on the package were at a standstill.
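To give a sense of how little code was involved, here is a sketch in Python of a function with the behavior described above. It is not the source of the unpublished NPM package; the function name, parameter names, and the single-character check are assumptions made for this illustration.

```python
def left_pad(text, length, fill=" "):
    """Pad `text` on the left with `fill` until it is `length` characters long."""
    text = str(text)
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    missing = length - len(text)
    if missing <= 0:
        # Already long enough: return the text unchanged.
        return text
    return fill * missing + text


print(left_pad("42", 5, "0"))
print(repr(left_pad(7, 3)))
```

Python ships the same behavior as the built-in `str.rjust`, so `"42".rjust(5, "0")` gives the same result as `left_pad("42", 5, "0")`. That a function this small could stall so many projects says more about dependency chains than about the function itself.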

On top of the lost productivity, security flaws could have easily been exploited. After the open source contributor responsible for the 11-line package unpublished all his NPM packages, their global package names became available for registration. A malicious developer could acquire one of these global names, republish it, and introduce malicious code into sites that depend on the unpublished package (11). This is a huge security issue, and it indicates a problem with blindly trusting that code will function as expected. It echoes a classic debate about open-source software: how much can we trust fellow developers? Whether or not we choose to trust them, there are security flaws to address with some modern frameworks.

Issue: Competition and Turnover

One difficulty of being a modern Web developer is the pace at which new technologies are developed. jQuery’s popularity decreased due to its lack of foresight into the single-page application era. Not all frameworks are designed with future applications in mind. As a result, new frameworks are created to provide new functionality. Angular introduced bidirectional data binding, and React introduced immutable data (12). Trying to keep up with the next hot framework requires developers to be constantly on their toes. For many smaller frameworks, the developer community is too small to warrant using the framework. We can observe Metcalfe’s law at work: the most valuable frameworks are the ones with the most developers.

The framework landscape is constantly changing. Whereas HTML has remained mostly consistent (and backwards compatible) since inception, JavaScript frameworks often grow obsolete within a few years of release (13). For example, right when AngularJS 1 was reaching its peak in popularity in 2016, Angular 2 was released. Angular 2 was essentially a full rewrite of the framework. Developers would need to learn the new framework or risk obsolescence.

Even so, new frameworks are adopted quickly, thanks to support from large tech companies (Angular and React are backed by Google and Facebook respectively) and network effects. Soon after release, JavaScript frameworks often develop a vibrant community of developers and a host of resources providing support. We can see the power of network effects in the growth of jQuery, Angular, and React.

Conclusion

Despite their flaws, we cannot deny the power of JavaScript and frameworks in Web pages. As mentioned before, single page interactive applications are only possible with AJAX and JavaScript frameworks. In the early days of JavaScript, codebases were sprawling and difficult to manage. In many cases, frameworks have introduced structure to JavaScript code. They have provided clean abstractions for manipulating the DOM and handling AJAX. Frameworks make building Web applications accessible for the average JavaScript developer. There is no shortage of developers on the Web who actively tout the advantages of using JavaScript frameworks.

Tim Berners-Lee advocated the “rule of least power” (14). That is, one should use the least powerful language suitable for a purpose. JavaScript has certainly evolved to do much more than its original purpose. The language offers powerful dynamic content and interactive experiences at the possible expense of simplicity and security.

Developers today are focused on building apps quickly that work across platforms both in the browser and natively. New frameworks like Electron, Ionic, and React Native make it easier for developers to do just that. The future of JavaScript may lie outside the browser.

How does the JavaScript community proceed into the future? Given the trends of the last decade, we should be fairly certain that frameworks will not disappear. However, developers can make choices to mitigate the issues they may run into when writing with frameworks. In terms of SEO, developers can choose to work with frameworks that play well with crawlers like the Googlebot. They can follow Google Webmaster standards to ensure their site is indexed correctly. To address security problems, central organizations like NPM should establish standards that protect Web pages from the risks of unvetted open source dependencies. The frameworks available will continue to change as long as the Web continues to evolve. These frameworks can be built for evolution, but may not anticipate the new ideas and technologies that developers want to implement.
