While looking at some Adobe technology, I came across James Ward's site. James is a Technical Evangelist for Adobe. One of the links he has on his site is a Flash-based performance application, the Census app, that "walks you through benchmarks for various methods of loading data in RIAs." The data was used at a conference to compare Adobe's performance against Dojo and general Ajax/JSON.
Note: If you try it yourself, be aware that I had problems using Firefox 3.0.x with his Flash app, so I recommend using Firefox 2.0.
It's an Adobe Flex application that compares the various load times for:
- Ajax HTML - 5000 rows
- Ajax JSON - 5000 rows
- Dojo - 1000 Rows
- Flex E4X 5000, 20000 rows.
Visually, it displays a line graph showing the server exec time, transfer time, parse time, and the render time. Very useful stuff in a good layout.
What struck me, though, was the performance claim for Dojo and Ajax/JSON in general. Dojo was shown to be so slow that he could only load 1000 rows, and JSON was shown as far too inefficient compared to AMF (Adobe's Action Message Format).
What's up with that? Having been a Dojo developer for some time now, I’ve never seen performance problems in my apps anything like the numbers the Census app reports for the various Ajax techniques.
So, in the parlance of Mythbusters, let's dig a little deeper …
My test configuration:
Before I begin my dive into James’ application, I’ll go through the machines I ran my tests on for a reference point. Naturally the power of a machine will affect the results of the test, so it’s only fair to disclose all of this up front:
Ajax Client:
Lenovo ThinkPad Z61P, Core2 DUO @ 2.0 GHZ, 2GB RAM.
Windows XP 32 bit SP3.
Firefox 2
Internet Explorer 7
Opera 9.61
Safari 3.1
Google Chrome Beta
Server Side:
(For my JSON generation services and DB access used in server generated JSON performance tests)
DELL Workstation powered by an Intel Xeon EM64T 3.4 Ghz CPU, 2GB RAM
64 bit Linux running kernel level: 2.6.5
64 bit WebSphere Application Server, Version 6.1.0.15
32 bit Apache HTTP Server V2.0.47
Where possible, I’ve included the source code of parts of my tests. I hope to include even the server side pieces (basic servlet using SQL Datasource) in the future.
First off, I didn’t look at James’ application code. It is covered under GPL, and since I am a direct contributor to Dojo as well as employed by another large company, I cannot be source or algorithm contaminated, or even give the appearance of misappropriation of code. So, all my analysis was done using James’ application as a black box and just looking at the types of requests it makes and the data returned.
To begin, his version of Dojo is ancient. He's using Dojo 0.4.3 as a comparison. All I have to say to that is OW! That is an extremely old version of Dojo and was well known to have performance issues. One of the major goals of Dojo 1.0 during the re-architecture of the library was to fix those performance issues. This included replacing the older O(n^2) filtering table with a virtual grid that could handle far, far, more rows efficiently by only paging in what rows were in view and only rendering what rows were in view.
While Dojo 1.x will handle tables of 1000+ rows, as a general principle of client side architectures you rarely, if ever, want to send a large table down to the client to parse. There are a couple of reasons for that statement:
- No user can view 1000 rows at the same time, so why bother sending all the data in one shot? It often just wastes bandwidth and rendering time.
- Sorting large data sets in the client isn’t often efficient … or often good at keeping the order correct when compared to other pages in the overall set of data. Sure, you can sort 0-5000 rows … but what happens when you have 100,000, or worse, 1,000,000 rows you need to sort across? What if your client isn’t a computer with lots of memory (think handheld device like a Blackberry or iPhone)? Huge data sets just aren’t feasible to sort in a wide variety of clients. So, in most cases it’s better to leave the sorting to the programs designed for fast sorting and data lookups. Or to put it simply, let the database or service handle the sorting; the client should only be concerned about displaying it. A database is designed to sort 100,000+ rows, a web browser in a mobile device isn’t.
Dojo 1.X took those considerations into account when developing the dojo.data API. It’s an abstraction layer for accessing data services, so that the user of a data store doesn’t have to know where sorting and the like occurs (it leaves it up to the data store implementation to decide). Dojo itself provides several stores that can read data in various formats and expose them in a common way. Some are completely executed in the client; some use a client/server service model for accessing data. Data bound widgets like Grid don’t even know where the sorting occurs. The widget asks for a page of data with X ordering applied and the store hands it back. For huge data sets, it’s highly likely the data store is just making a call to a database service to sort and hand back the page.
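As a rough illustration of that request/response shape, here is a minimal sketch of a store that hands back one sorted page at a time. PagedStore and its in-memory rows are hypothetical stand-ins (they are not part of Dojo); the request fields sort, start, count, and onComplete are modeled on the dojo.data read API, and a real store could just as well forward them to a database service.

```javascript
// Hypothetical in-memory store; a real implementation could forward the
// same sort/start/count arguments to a server-side service instead.
function PagedStore(rows) {
  this.rows = rows;
}

// fetch() takes a request object shaped like a dojo.data fetch request:
//   sort:  [{ attribute: "name", descending: false }]
//   start: index of the first row wanted
//   count: number of rows wanted
PagedStore.prototype.fetch = function (request) {
  var sortSpec = request.sort || [];
  var sorted = this.rows.slice(); // copy so the source order is untouched
  sorted.sort(function (a, b) {
    for (var i = 0; i < sortSpec.length; i++) {
      var attr = sortSpec[i].attribute;
      var dir = sortSpec[i].descending ? -1 : 1;
      if (a[attr] < b[attr]) { return -1 * dir; }
      if (a[attr] > b[attr]) { return 1 * dir; }
    }
    return 0;
  });
  var start = request.start || 0;
  var count = request.count === undefined ? sorted.length : request.count;
  var page = sorted.slice(start, start + count);
  if (request.onComplete) {
    request.onComplete(page, request);
  }
  return page;
};

// Usage: the caller (a grid, say) only ever sees the page it asked for.
var store = new PagedStore([
  { name: "Carol", age: 41 },
  { name: "Alice", age: 30 },
  { name: "Bob", age: 25 },
  { name: "Dave", age: 52 }
]);
var firstPage = store.fetch({
  sort: [{ attribute: "name", descending: false }],
  start: 0,
  count: 2
});
```

The caller never learns whether the sort ran locally or remotely, which is the property the paragraph above describes.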
But to avoid going further off in a tangent on dojo.data, the point is Dojo 1.X tries to be a lot smarter about where certain actions occur so that they are handled in the most efficient manner possible.
JSON vs AMF:
The more interesting comparison is JSON vs AMF. AMF began its life as a proprietary protocol from Adobe and is similar in concept to Java's object serialization. It's a binary format that relies on strongly typed definitions of the data structures. Adobe has recently open sourced the specification in an attempt to gain wider acceptance. JSON, being a loosely typed language subset, includes structure details of how the data is represented.
JSON as an efficient transport:
It's the loosely typed structure of JSON that gives the developer a lot of power and flexibility in how to represent their data. It also provides a lot of opportunity for inefficiencies, which may be what happened in James's demo.
First off, JSON can be generated that compresses close to or just as well as AMF under GZIP. In fact, JSON should generally compress better than AMF as a percentage of the original structure size. The reason for that gets into lossless compression theory and what is called the ‘entropy’ of your data, but that’s honestly a really dull subject and isn’t worth going into here. Think of it this way: text often compresses better than binary since text tends to have lots of repeated sequences that can be code-indexed, whereas binary tends to be more random, which makes it harder to build long sequences of like data to be code-indexed.
The other thing I’ve heard about AMF is that for smaller data sizes, the JSON equivalent will be smaller. In fact, I’ll be taking a look at that in a bit. My basic understanding on why it is smaller is that AMF can extract out structural details into a header automatically, where JSON encodes it within the actual object constructs.
In either case, for AMF or JSON, if you’re sending a large amount of data you should always try to compress it using GZIP or DEFLATE filters if the client can support those encodings. This will drastically reduce ‘on the wire’ times. It was great to see that the Census App was already doing GZIP compression as a standard operation for both its Ajax/JSON and AMF example tests.
Now, you might be wondering since AMF likely encodes object structure details in its header automatically, will it always be smaller than JSON for large data sets? The short answer to that is no, it doesn’t have to be. Remember where I said JSON is flexible? Well, this is the point where its flexibility becomes really useful in reducing the size of a JSON payload.
So … let’s use JSON flexibility and see how we can improve a JSON payload size!
… and even better, let's use the Census App data as our starting point so it’s readily obvious the compressed payload size can be reduced. Using Firebug, I was able to get the URL the application called when it was loading its 5000 rows of JSON data. For reference, it’s:
http://www.jamesward.com/census/servlet/CensusServiceServlet?id=ajax_json&command=getJSON&rows=5000&gzip=true
The first thing to notice is that all the rows are homogeneous, meaning all the rows have the same attributes. Homogeneous data can encode format information in the header of the JSON data instead of repeating it redundantly in each object. This can be done by converting all the JSON objects into arrays, so that the items property is just an array of arrays of values, where each index in a row array maps to a string name in a ‘cols’ array. See the following for clarification:
{
"cols": ["field0", "field1", "field2", ...],
"items": [["field0Val", "field1Val", ...],
...]
}
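A minimal sketch of that conversion, in both directions, might look like the following. The function names packRows and unpackRows and the sample field names are assumptions made for illustration; the sketch also assumes every row has exactly the attributes of the first row.

```javascript
// Convert an array of homogeneous objects into the { cols, items } layout.
// Column names are taken from the first row (assumes every row has the
// same attributes).
function packRows(rows) {
  var cols = rows.length ? Object.keys(rows[0]) : [];
  var items = rows.map(function (row) {
    return cols.map(function (col) { return row[col]; });
  });
  return { cols: cols, items: items };
}

// Rebuild the original objects on the client after parsing.
function unpackRows(packed) {
  return packed.items.map(function (values) {
    var row = {};
    packed.cols.forEach(function (col, i) { row[col] = values[i]; });
    return row;
  });
}

// Hypothetical sample rows.
var sample = [
  { name: "Alice", sex: "Female", age: 30 },
  { name: "Bob", sex: "Male", age: 25 }
];
var packed = packRows(sample);
var restored = unpackRows(packed);
```

The attribute names now appear once in the payload instead of once per row, and the client can still get ordinary objects back if it wants them.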
So, if we take the payload from the servlet and apply this formatting we get:
GZIP JSON for that payload: 38.1 K.
AMF is now only 78% the size of JSON at 5000 rows.
Good improvement! But, we can still do better…
Data with only finite values can be represented as integers instead of strings. Or effectively, represent finite values as enumerations instead of a String type. Again, their data provides places where this can be neatly applied. For example, look at the ‘sex’ field. This can only have values of "Male" or "Female". So, why not represent those as numbers in JSON instead?
For example, represent "Male" as 0 and "Female" as 1. This encodes each value as a single byte instead of six or more. More generally, you can apply this optimization using a format like the following, where each value in an enumerated column is an index into that column's "enums" array:
{
"cols": ["field0", "field1", "field2", ...],
"enums": { "field0": ["enum1", "enum2", ...]},
"items": [[0, "field1Val", ...],
...]
}
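Here is one way the enumeration step could be sketched. encodeEnums and decodeEnums are hypothetical helper names, and the codes are assigned in first-seen order, so which value ends up as 0 depends on the data rather than on a fixed mapping.

```javascript
// Replace string values in the listed columns with integer indexes into
// a per-column "enums" table. `packed` uses the { cols, items } layout.
function encodeEnums(packed, enumCols) {
  var enums = {};
  var colIndex = {};
  enumCols.forEach(function (col) {
    enums[col] = [];
    colIndex[col] = packed.cols.indexOf(col);
  });
  var items = packed.items.map(function (values) {
    var out = values.slice();
    enumCols.forEach(function (col) {
      var i = colIndex[col];
      if (i < 0) { return; } // column not present, nothing to encode
      var code = enums[col].indexOf(out[i]);
      if (code === -1) {
        code = enums[col].push(out[i]) - 1; // first time we see this value
      }
      out[i] = code;
    });
    return out;
  });
  return { cols: packed.cols, enums: enums, items: items };
}

// Reverse the mapping on the client.
function decodeEnums(encoded) {
  return {
    cols: encoded.cols,
    items: encoded.items.map(function (values) {
      return values.map(function (value, i) {
        var table = encoded.enums[encoded.cols[i]];
        return table ? table[value] : value;
      });
    })
  };
}

// Hypothetical sample data.
var packed = {
  cols: ["name", "sex"],
  items: [["Alice", "Female"], ["Bob", "Male"], ["Carol", "Female"]]
};
var encoded = encodeEnums(packed, ["sex"]);
var decoded = decodeEnums(encoded);
```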
Okay, now GZIPing the payload with this optimization applied.
GZIP JSON for that payload: 37.2 K.
AMF is now only 81% the size of JSON at 5000 rows.
Great! More size improvement. It’s not as drastic as the first one, but it’s still a few percent.
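If you want to try this kind of comparison on your own data, a Node.js sketch along these lines could be a starting point. It uses zlib.gzipSync from Node's standard library; the rows are synthetic and the column names are made up, so this is not the Census data and the sizes it prints will not match the numbers quoted here.

```javascript
// Node.js sketch: compare gzip sizes of three JSON layouts for the same rows.
var zlib = require("zlib");

// Synthetic rows for illustration only; this is NOT the Census data.
function makeRows(n) {
  var rows = [];
  for (var i = 0; i < n; i++) {
    rows.push({
      id: i,
      age: 18 + (i * 7) % 60,
      sex: i % 2 === 0 ? "Male" : "Female",
      education: ["HS-grad", "Bachelors", "Masters"][i % 3]
    });
  }
  return rows;
}

// Size in bytes of the gzipped JSON text for a value.
function gzipSize(value) {
  return zlib.gzipSync(Buffer.from(JSON.stringify(value), "utf8")).length;
}

var rows = makeRows(5000);
var cols = Object.keys(rows[0]);

// Layout 1: array of objects (attribute names repeated in every row).
var asObjects = { items: rows };

// Layout 2: column names once, rows as arrays of values.
var asArrays = {
  cols: cols,
  items: rows.map(function (r) {
    return cols.map(function (c) { return r[c]; });
  })
};

// Layout 3: additionally replace "sex" strings with integer codes.
var sexEnum = ["Male", "Female"];
var sexCol = cols.indexOf("sex");
var asEnums = {
  cols: cols,
  enums: { sex: sexEnum },
  items: asArrays.items.map(function (values) {
    var out = values.slice();
    out[sexCol] = sexEnum.indexOf(out[sexCol]);
    return out;
  })
};

var sizes = {
  objects: gzipSize(asObjects),
  arrays: gzipSize(asArrays),
  enums: gzipSize(asEnums)
};
console.log(sizes);
```

How much each step saves after compression depends heavily on the data, so it is worth measuring rather than assuming.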
Graphically, applying optimizations to the JSON payload shows the following trends in size reduction:
Figure 1: JSON GZIP Size Versus Applied Optimizations:

We could certainly go on with the optimization hunt for James's JSON data, but I think the point has been made: JSON’s flexibility grants you the ability to morph your wire format to improve efficiency of the transfer for your specific application. As demonstrated, with only a little adjustment, I was able to reduce James’ application data to within seven kilobytes of AMF at 5000 rows. Could it be reduced further by enumerating out more common fields? Yes.
But … let's go back to a comment I made earlier. Remember where I said that AMF was larger than JSON for smaller payload sizes? Well, now is where I can prove it, and all by using James’ application. Using Firebug again, I was able to identify the request that was generating AMF. The URL is:
http://www.jamesward.com/census/flex_amf3.html?rows=5000&gzip=true
Great, it even let me specify the row count, just like the JSON URL would. So … we can explore payload sizes. But, this turned out to be a little harder than I first expected, because that URL is only one of three requests the application must make to get row data. Yep, that’s right, where the JSON service is a single request, the AMF service is actually three requests. You can see that below in my screenshot from a debugging proxy:
I should point out here that his application doesn’t seem to include the transfer times and sizes of the intermediary data sent. Should it? Probably, as it’s all necessary information for his AMF/Flex application to render. But, that’s neither here nor there at this point. I said we would look at the data payload sizes, so that will be the focus. To get the AMF payload size I had to use the debugging proxy to locate each data package as I scaled it from 0 rows up to 12400 rows. The results were interesting.
Figure 2: AMF versus JSON payload size:

Okay, so it does look like up to roughly 200 rows, the payloads are pretty much the same size. But, are they really? Given the scaling on the Y axis, it’s hard to see the differences in payload size until 400 or so rows. So … let’s adjust the Y axis a bit so that it’s not a linear scale.
Figure 3: AMF versus JSON payload size (logarithmic):

Aha! Okay, now we can really see the difference. Up to about 80 to 100 rows, the JSON payload is actually smaller than AMF for the data. Why? Remember when I said AMF likely encodes information in the headers about the structure of the object? Well, in smaller payloads, that information actually ends up larger than the actual data being sent. Simply put, its overhead is larger than JSON’s for smaller data sizes. This also implies something else … that JSON would be a better format for paging data (data in more consumable chunks, such as 50 rows at a time, which is about the size of a printed page) than AMF. And remember! This is James’ data without any of the optimizations I went through earlier. I could probably make the JSON payloads even smaller by applying those changes. But, this analysis has gone on long enough as it is and there is another topic I want to bring up with regards to his application and the times it shows.
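To make the overhead argument concrete, here is a toy size model: one format pays a fixed header cost plus a small per-row cost, the other pays no header but a larger per-row cost. Every byte count in it is an invented illustration value, not a measurement of AMF or JSON, and the fixed-header behavior itself is only my guess about AMF as stated above. Only the shape of the argument matters.

```javascript
// Toy size model: a format with a fixed header plus a per-row cost,
// versus a format with no header but a larger per-row cost.
// All byte counts below are made-up illustration values, not measurements.
function payloadSize(headerBytes, bytesPerRow, rows) {
  return headerBytes + bytesPerRow * rows;
}

// Row count at which the two formats are the same size.
function crossoverRows(headerA, perRowA, headerB, perRowB) {
  return (headerA - headerB) / (perRowB - perRowA);
}

var binary = { header: 2000, perRow: 6 }; // hypothetical "AMF-like" costs
var text = { header: 0, perRow: 26 };     // hypothetical "JSON-like" costs

var breakEven = crossoverRows(binary.header, binary.perRow,
                              text.header, text.perRow);

// Below breakEven rows the text format is smaller; above it, the binary one.
var at50 = {
  binary: payloadSize(binary.header, binary.perRow, 50),
  text: payloadSize(text.header, text.perRow, 50)
};
var at200 = {
  binary: payloadSize(binary.header, binary.perRow, 200),
  text: payloadSize(text.header, text.perRow, 200)
};
```

With these invented numbers, a 50 row page is smaller in the text format and a 200 row page is smaller in the binary one, which is the same pattern the graph shows.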
So … moving right along to the next topic.
JSON takes longer to parse?
Wow... In James's demo, he claimed that Firefox 2 took 1.3 seconds to parse the JSON data. Maybe he's right, using Dojo 0.4.3 or some other parser … or maybe he’s somehow including the wire transfer time in it … but regardless, that number seems rather large and goes against all my experience of how fast JSON parses.
So … once more into the Myth busting! As with my previous tests, I took the data output from James’ servlet and stored it on an Apache server with GZIP enabled. This was to mimic the server request and gzip. From there I made a very simple HTML page that just used Dojo’s fromJson() to time how long it takes the browser to parse the JSON text back into a JavaScript object. In fact, the code I used can be seen below:
<script type="text/javascript" src="dojo.js"
    djConfig="isDebug: true, parseOnLoad: true, usePlainJson: true"></script>
<script type="text/javascript">
    function getGridData() {
        dojo.xhrGet({
            url: "test.json",
            handleAs: "text", // keep the raw text so the parse can be timed on its own
            timeout: 60000,
            load: function(response, ioArgs) {
                // Time only the text-to-object parse, not the transfer.
                var sTime = new Date().getTime();
                var obj = dojo.fromJson(response);
                var eTime = new Date().getTime();
                var delta = eTime - sTime;
                alert("Total processing time: " + delta/1000); // in seconds
            },
            error: function(error, args) {
                console.warn(error);
            }
        });
    }
    // Run the test once the page and Dojo have finished loading.
    dojo.addOnLoad(getGridData);
</script>
<style type="text/css">
</style>
<title>DojoGrid</title>
From there I just loaded that page into a variety of web browsers. In each one I ran it four times and cleared the cache between each run. The results I got were far, far better than the 1.3 seconds James’ application claims (in the same Firefox browser). Even more interesting, I got a result for one browser that I didn’t expect. Please see the following chart.
Figure 4: JSON Parse time for Various Browsers Versus James’ Application claim:

Using Dojo 1.2.1's dojo.fromJson() to parse the same JSON in the same browser took about 0.187 seconds on average in Firefox 2.0. That looks much, much better than 1.3 seconds (the black bar), doesn’t it? The biggest surprise was that Internet Explorer 7 actually had the best JSON parse time out of all of them, at 0.078 seconds.
Now that I saw a much better parse time than his test app claimed … I looked a little harder at it and noticed something that seemed strange to me. Consistently, the transfer time for an AMF payload was much longer than for the JSON payload according to his app. Wait, doesn’t that conflict with the statement his app made about how AMF is smaller than JSON? If AMF is indeed smaller, then its transfer (on the wire) time should be less than JSON's, shouldn’t it? That piqued my deductive interest, as it goes against what would be expected! So … I did some further analysis of his numbers…
The application claimed that the AMF parse time is 0 seconds. Okay, that seems a bit strange, but parsing could just be really, really fast. But … I have another theory that explains the parse time being zero and why the AMF transfer time is longer:
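The repeat-and-average measurement can also be scripted instead of reloading the page by hand. The sketch below is one way to do it; timeParse is a hypothetical helper, it defaults to the engine's built-in JSON.parse rather than dojo.fromJson (pass your own parser as the third argument to time that instead), and the payload is a tiny synthetic one so the sketch runs anywhere.

```javascript
// Time a parse function over several runs and report each run plus the mean.
// `parse` defaults to JSON.parse here; any text-to-object function can be
// passed in its place.
function timeParse(text, runs, parse) {
  parse = parse || JSON.parse;
  var times = [];
  var result;
  for (var i = 0; i < runs; i++) {
    var start = Date.now();
    result = parse(text);
    times.push((Date.now() - start) / 1000); // seconds
  }
  var total = times.reduce(function (sum, t) { return sum + t; }, 0);
  return { times: times, average: total / runs, result: result };
}

// Small synthetic payload; substitute the real response text to measure it.
var payload = JSON.stringify({
  cols: ["id", "sex"],
  items: [[0, "Male"], [1, "Female"]]
});
var report = timeParse(payload, 4);
```

Date-based timers are coarse, so for small payloads most runs will report 0; the approach only becomes meaningful with payloads in the thousands of rows.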
My Theory:
The stated AMF transfer time is in reality the AMF transfer and parse time. I believe that what his app calculated for the transfer time includes the parsing and object load time the Flex engine goes through with an AMF stream. I suspect he has no real way to separate the parse time from the transfer (wire only) time, so his graph reports it all as transfer time and leaves parse time at 0.
Now, I may be wrong with that deduction, but consider the following:
If we take his number for JSON transfer (wire) time and add it to the JSON parse time I calculated, we get a very interesting value … around 0.7 seconds. This value is interesting because it is in the same ballpark as the AMF transfer time, which was shown in my run as 0.874 seconds! So, if I am correct in my deductions … this would mean the transfer and load time for AMF versus JSON is roughly the same. That seems reasonable, as the data is roughly the same size and both Flex and JavaScript are virtual machine interpreters.
Now, obviously when you run the same tests you may see some number variations. Everything is affected by the speed of the machine you run the test on, the browser version, the Flash version, and so on. The key thing to look at is not the numbers separately, but the numbers relative to each other on the same system.
Okay, so, this is getting long. Please bear with me, I have just one more topic I want to cover.
Server generated JSON takes longer to create than server generated AMF?
In James's demo, I didn't see any real context of how he was creating his JSON data on the server side. Maybe it was hard-coded, or maybe he was using an open source JSON library, not certain since I did not look at his code to avoid contamination concerns.
The test I did took an Apache Derby database containing 100,000 rows and used JSON4J to convert the query results to a JSON data stream. The dataset contained columns similar to the ones they used. I did not see anything close to the server generation time they claim of 2.1 seconds. The JSON parser and generation library I used was the JSON4J library out of the WebSphere Application Server Feature Pack for Web 2.0. The server side was WebSphere Application Server, version 6.1.
In my simple Datasource (JDBC) servlet, doing a basic SELECT * FROM PEOPLE on my table and returning the first 5000 rows as JSON (using the JSON4J library) gave a server generation time of 0.255 seconds. This is much closer to their AMF time of 0.078 seconds than to the 2.1 seconds they claim for JSON.
I decided to go a bit further and try to see what just the JSON serialization cost. So I removed the JSON generation entirely. I wanted to see how much server time was spent just iterating over the database rows and accessing fields. What I determined was that it took the server 0.100 seconds just to iterate over the 5000 rows, without actually serializing anything! This is greater than their claimed time for the DB query, loading AMF data objects, and serializing combined. Taking the total of 0.255 seconds and subtracting the DB iteration time of 0.100 seconds, I can deduce that the JSON server overhead is about 0.155 seconds using the JSON4J utility library. Still nowhere near the 2.1 seconds his application claimed it took his server to generate JSON.
So … the numbers raise the question of what the Flex application is doing for AMF generation, up to serialization of the data, on the server side. If both paths use the same DB queries and differ only in serialization, I can’t see how the numbers could differ so hugely.
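The same subtraction method can be sketched in a few lines. My actual test was a Java servlet against a database; the JavaScript below is only an analogy, with an in-memory array of synthetic rows (hypothetical columns) standing in for the result set and JSON.stringify standing in for the server-side JSON library.

```javascript
// Time a function once and return elapsed milliseconds plus its result.
function timed(fn) {
  var start = Date.now();
  var value = fn();
  return { ms: Date.now() - start, value: value };
}

// Stand-in for a database result set (hypothetical columns, synthetic rows).
var rows = [];
for (var i = 0; i < 5000; i++) {
  rows.push({ id: i, age: 18 + (i % 60), sex: i % 2 ? "Female" : "Male" });
}

// Pass 1: touch every field of every row but build no output.
var traverse = timed(function () {
  var touched = 0;
  rows.forEach(function (row) {
    for (var key in row) {
      if (row[key] !== undefined) { touched++; }
    }
  });
  return touched;
});

// Pass 2: a full pass that also serializes every row to JSON text.
var serialize = timed(function () {
  return JSON.stringify({ items: rows });
});

// Rough serialization overhead: full pass minus traversal-only pass.
// Timer resolution is coarse, so treat small or negative values as noise.
var overheadMs = serialize.ms - traverse.ms;
```

On a single run the difference is noisy; repeating both passes several times and comparing averages gives a steadier estimate.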
Anyway, that question remains open. I’m not sure how to continue on further examination of the server side claims without knowing more about what his application did for AMF generation, and I can’t look at that because of contamination concerns. So, if anyone can shed some light on what they’re doing and how it can show faster numbers than even a simple DB row traversal, I would love to know.
In summary:
In closing, I would like to say that this investigation has confirmed a few things for me:
The first is that Flash and Flex are good technologies and hold up very well to large data sets and use efficient formats. Adobe has done a great job in designing them. My intention with this post was not to attack Flash, only some of the claims made about its superiority to Ajax techniques. I just didn’t feel that James’ Census application gave a completely fair view of how Ajax/JSON can be used well in applications.
The second thing it confirmed for me is that while Flash and Flex are nice tools, you don’t need them to create well-performing Ajax/JSON applications. As my analysis showed, through good construction of your data payloads, you can get even large payload size close to that of a binary protocol such as AMF. And if you’re doing paging style access, well … smaller payloads are more size efficient in JSON than they are in the AMF binary format.
Lastly, my work on a simple server-side servlet that just issued SQL queries and returned the data as JSON showed me that server-side JSON rendering can be very fast. It’s certainly not the horrible performer James’ app tries to make it out to be.
So, if you take anything from this, I hope it is a better understanding that Flash and Flex are good tools, but that Open Web (Ajax) techniques can be just as good and, in reality, just as efficient.
| Attachment | Size |
|---|---|
| Size.jpg | 23.42 KB |
| debuggingproxy.jpg | 42.32 KB |
| SizeVersusRowsL.jpg | 23.43 KB |
| SizeVersusRowsN.jpg | 26.89 KB |
| ParseTime.jpg | 34.02 KB |


Couple of things to note...
...this Census app is actually pretty old; when he wrote it, he used the original 0.4 code (ie 0.4.0). That code is the one with the O(n^2) access on FilteringTable.
With either 0.4.1 or 0.4.2 (I can't remember), I implemented hashtable lookup on the data, which changed it to O(1).
The other thing that he neglected to mention (either because he didn't know or was busy writing Census, which does have a nice interface) is that the majority of the time taken by FilteringTable most definitely was *not* the data parsing, but the rendering of rows in a browser--and that has to do with most browsers' 450 row limit before things go off the deep end performance-wise.
Though I'll admit that what I should have done in the end, with the sorting, was to pop the table out of the document, sort it, and then pop it back in.
It's old, but the scary part
It's old, but the scary part is people still point to it as proof that Flash is so much better than Ajax and Dojo in particular. I found a lot of its claims very suspect, in all honesty.
-- Jared
Flash and Flex is a nip-and-tuck race
Flash has better support for cookies, but AJAX seems to be a little more search engine friendly, even if not optimal.
Flex 3 documentation
I find it rather interesting that Flex 3 documentation is done in part using Yahoo UI widgets.
Take a look at the Flex tree component doc:
http://livedocs.adobe.com/flex/3/html/help.html?content=dpcontrols_8.htm...
The tree on the left is a JavaScript widget. Why didn't they use the Flex Tree component?
That Old Census Demo
Hi,
Thanks for doing a more in-depth analysis of the Census demo. Admittedly it is old and uses an old version of Dojo. Yet the purpose of the demo is to show two things which still stand true with newer versions of Dojo:
1) When transferring large data sets across the wire it's beneficial to have a binary protocol
2) With a capable VM on the client data operations like sorts or filters should be done on the client
I do not at all intend to point the finger at Dojo. It's a great Ajax framework but inherently limited by JavaScript in the browser. In newer versions of Dojo and in many use cases the limitations of JavaScript don't have an impact on the end result. But in the case of large data sets these limitations do have an impact. That is what Census is trying to show.
BTW: I'm trying to get Census updated to the newest version of Dojo. Hopefully I get this done soon. But Census is open source so if anyone wants to help I'd be happy to have it!
-James
Firefox 3
BTW: The Firefox bug is a change in FF3 with how iframes are handled. I need to fix that.
-James