debuggable

 

XHTML died alone, the semantic web is next

Posted on 20/7/09 by Felix Geisendörfer

Story time:

Thursday, July 2nd 2009. Officials announce the death of XHTML2. Multiple suspects, including HTML4, HTML5 and XHTML1 have been taken into pre-trial custody.

The investigation is difficult. XHTML1 seems to have no motive as XHTML2 was his son. The father can barely speak about his loss without bursting into tears. However, he is a very strict man and everybody knew he had extremely high expectations of his son. Could his bad temper have killed XHTML2 for making a simple mistake?

HTML4, however, has a motive. He was once the big star in town. But then came XHTML1 and took that fame away. Every theater, including "the Fire Fox", "the Opera" and "the Web Kit", loved XHTML1 and told HTML4 it was time to retire. Not only that, many critics also saw great talent in XHTML1's son XHTML2, who was competing with HTML5. So did HTML4 kill XHTML2 out of revenge, to clear the way for his own son?

The prime suspect of course is HTML5. He was a direct competitor to XHTML2. The two had very different professional opinions. XHTML2, just like his father, was a strict perfectionist. HTML5 however is often characterized as a forgiving, practical performer. But investigators have found that HTML5 had the support of many cutting-edge theaters since at least 2004 and had no reason to fear XHTML2's competition. Did HTML5 feel he had to kill XHTML2 to simply destroy his strict philosophy?

3 years later. Investigations on the XHTML2 case have long been closed. XHTML2's death was classified as suicide. The media said XHTML2 could not gain the support of any major theater and killed himself out of desperation.

Now a new case is shaking up the news landscape. HTML5 had a stepbrother, XHTML5, and his body was just found in a motel room. He was the new leader and last hope of a movement called "the semantic web". Probable cause of death - microformat overdose.

XHTML5's death marks the end of the movement. He and his followers thought they could make performance art accessible to a new audience of robots and intelligent machines. Those machines were very hungry for information, but could not understand performers like HTML4 and HTML5. "The semantic web" thought that by enforcing the strictness invented by XHTML1 and through the heavy injection of microformats they could perform at a level of perfection that could bridge the gap between the visuals and the underlying information.

From early on, the "semantic web movement" was very violent. Non-supporters were labeled as reactionaries. The standards developed by the movement made its followers totally blind to reality. The biggest theater, known for its greedy operators and huge audience, "the Internet Explorer", simply ignored the movement.

But then came JSON, a theater janitor who would permanently transform the stage. JSON himself was raised by JavaScript, a class of machines invented at the old "Netscape" theater. The JavaScript machines were originally only used to do basic lighting and other low-level work at the theaters and nobody really understood their true power. But through JSON they were able to communicate with other machines. Machines who previously only talked to XHTML1's grandfather - XML.

JSON just happened to be much more eloquent than XML, and his slang quickly became popular among a new generation of machines. And he became friends with HTML4 and HTML5. They could not understand each other directly, but the JavaScript machines were able to express JSON's thoughts to HTML4 & 5. So HTML4 & 5 realized that JSON could take over XML's job as a playwright.

And so the stage was slowly transformed. Plays were written by JSON who then performed them for the machines. JavaScript and other machines enabled HTML4 & 5 to perform the plays in their unique ways that were still more attractive to humans.

The XHTML family died because it wanted to write and perform for humans and machines alike. But the machines were not ready for it yet and wanted their own performances. Accepting this truth ultimately helped the humans who could now enjoy more entertainment & knowledge than ever, thanks to their automated friends.

=== The End ===

If you did not like the performance, entrance fees are unfortunately non-refundable. But feel free to release your grief over the death of XHTML in the comments.

-- Felix Geisendörfer aka the_undefined

 

Debuggable.com 2.0

Posted on 15/7/09 by Felix Geisendörfer

We removed our old two-guys-in-a-garage theme and replaced it with the beautiful work of Abhay Singh. We could have kept the old look, but then we would have also had to buy a garage at some point.

We are now also offering commercial support for CakePHP, jQuery and Git.

Let us know what you think. Other than that we should post more, that is. We have a few goodies queued up already ; ).

-- Felix Geisendörfer aka the_undefined

PS: Be careful with our RSS bug, he does not like to be annoyed.

 

Summary of CakeFest #3 - Berlin

Posted on 15/7/09 by Felix Geisendörfer

Hey folks,

if you are on Twitter, you probably heard that CakeFest #3 was beyond awesome again this time around.

Amongst other things, Nate announced Cake3, a largely rewritten version of CakePHP that works exclusively with PHP 5.3+ and takes advantage of all the new features like namespaces and closures. You can already check out the source, but don't hold your breath - this version is many, many months away from a release.

On the first day, Gwoo talked about the more immediate future of CakePHP. CakePHP 2.0 will pretty much be CakePHP 1.2 (ok, actually 1.3) with PHP5 strict-mode compliance. There should be no breaks in backwards compatibility, but we expect to see a significant performance boost from dropping PHP4 support.

There was a lot of discussion up front on whether Cake3 should be based on the current core or whether it should be a rewrite like it is now. We hope that the benefits of a new architecture will outweigh the problems of backwards compatibility in the long run. It will certainly be interesting to see the uptake of 5.3 as it is a much more significant revision than the version number would indicate.

Anyway, back to the past. The conference was preceded by a 2-day workshop. There was some trouble getting started, since we had intended to hand out a copy of the app we were going to build with only the design in place, but falling back to an empty core worked as well. Over the 2 days Nate & Mariano did a fantastic job building a Twitter clone from scratch that also interacts with the actual Twitter via a datasource and demonstrates many other CakePHP tools. I think we hit an even better compromise between not overloading beginners and still keeping it attractive to intermediates, but there also seemed to be some demand for an advanced-only CakePHP workshop.

The conference itself was quite fantastic as well. We had ~70 people at the event, which was a bit of a challenge when trying to pick bars & restaurants at night : ). Pretty much all presentations were fantastic and seemed to be a good mix of different topics.

Garrett also introduced us to a new gentleman's game that he has helped to develop, named Fork Master. The game is beyond fantastic and there is now a rapidly growing international community of players : ).

-- Felix Geisendörfer aka the_undefined

PS: Graham is uploading all CakeFest slides right now as well as writing articles covering the talks in detail.

Update 1: Just saw that Kevin also has a fantastic post on the things he picked up at CakeFest.

 

Sales Almost Closing for CakeFest#3 in Berlin!

Posted on 6/7/09 by Tim Koschützki

Hey folks,

There are only around 10 hours left until sales close for the biggest CakeFest so far. It will take place from Jul 9 until Jul 12 in Debuggable's hometown Berlin, Germany. It is in fact two events in one: a CakePHP workshop and the main conference with talks presented by core developers and community members.

The CakePHP Workshop

The CakeFest CakePHP Workshop (July 9-10) provides an opportunity for beginner and intermediate developers to jump-start their experience with CakePHP by learning directly from the CakePHP core team. Participants are strongly encouraged to bring their laptops, as they will be learning hands-on how to set up and build CakePHP applications.

Nate and Mariano will talk about topics including basic application design, expert project workflow techniques, interacting with databases and web services, JavaScript and Ajax, and much more. The presentations are classroom-style allowing participants to follow along and ask questions. Check out the workshop schedule.

Felix, a few others and I will be on hand to provide one-on-one help. Whether you are a designer who finds the workshop pace a bit fast, or an experienced developer with a difficult architectural problem, the workshop and the one-on-ones will get you up and running and provide all the help you need.

The Main Conference

The CakeFest Conference (July 11-12) is the premier event for CakePHP developers of all levels to meet, socialize, and learn collaboratively. Attendees will be immersed in a collective learning environment where some of the coolest ideas of the CakePHP community will be discussed and presented.

Topics include - but aren't limited to - Demystifying Webservices in CakePHP, Building Custom APIs and JavaScript for PHP Developers. Check out the full schedule and head on over and get your ticket.

What do we do in the evenings?

The evenings will be packed with fun dinners, bar visits and club visits. If you thought the discussions would end when the talks do, you are wrong. Past CakeFests have proven to have very enjoyable evenings with plenty going on.

How many folks will be there?

Just to give you an idea of how many folks will be attending, as of now there are:

  • 9 people from the CakePHP Development Team
  • 35 people attending the workshop
  • 62 people attending the conference

So you can see, there are tons of opportunities to network, discuss and learn about new and old things. You will be able to meet core team members and discuss your applications, experience and wishes with them. In the evenings we will hang out, drink one or two beers, and just have fun! And all of that at affordable prices!

So, what are you waiting for? Hop on over and grab your ticket. :)

See you all there!

-- Tim Koschuetzki aka DarkAngelBGE

 

CouchDB Insert Benchmarks

Posted on 25/6/09 by Felix Geisendörfer

Hey folks,

I am currently working on replacing Amazon S3 as the key value storage service for Debuggable's new startup. The main reason for that choice is that we want the ability to license the technology for in-house usage, which means the S3 dependency has to go.

Over the past year or so I have repeatedly heard good things about CouchDB (not to mention that Damien Katz became a personal hero of mine after seeing this video). But the main reason for preferring CouchDB over all the alternative key value stores is CouchDB's simplicity. Using HTTP + JSON as the protocol and embracing complete RESTfulness makes getting started with CouchDB ridiculously easy. It also makes it very awesome as you can build your architecture with other HTTP tools such as reverse proxies, HTTP load balancers, etc.
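To make the HTTP + JSON point a bit more concrete, here is a minimal Python sketch (not taken from the benchmark project) that only builds the request for storing a single document via PUT /<db>/<id>. The host, port, database name "bench" and document ID are assumptions for illustration, and nothing is sent over the network.

```python
import json
import urllib.request


def build_put_document(base_url, db, doc_id, doc):
    """Build (but do not send) the HTTP request that stores one JSON document."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical local node and database name, for illustration only.
request = build_put_document("http://127.0.0.1:5984", "bench", "doc-1", {"hello": "world"})
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen() would actually send it. Because it is plain HTTP, the same request could just as well go through a reverse proxy or load balancer.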

Anyway, having done absolutely nothing with CouchDB before, I feel I should avoid shooting myself in the foot by making poor performance / scaling assumptions that may or may not be true for our particular use cases. So I did what every new user of an open source project should do and set up some benchmarks.

However, the whole point of these is not to find out how fast CouchDB really is - I don't care as long as it scales horizontally. No, I am mostly interested in finding out 2 things:

  • a) How much slacking time do I have left before we seriously need to scale out to multiple CouchDB nodes
  • b) When that day comes, what can we expect in terms of replication delay (eventual consistency) in a multi-master setup

So far I have been mostly studying question a). Since our application is mostly storage heavy, but does not necessarily have lots of read hits (see, this is the point where you should realize my tests might not help answer your questions at all - sorry), I was wondering about the disk space consumption relative to the number of documents stored in the database.

After reading a bit about the B+ tree CouchDB uses for storage, I assumed that the required disk space would grow linearly with the number of documents stored. However, my initial tests indicated that the disk space / document was growing with each document I was adding. Suspecting a bad setup, and after discussing this a bit in #couchdb, I decided to create a more serious setup on Amazon EC2, inserting anywhere from 0 to 1 million records.

I put all the code for setting up the environment and running the tests on GitHub, see my couchdb-benchmarks project.

Anyway, you have waited long enough. Time for my initial results:

doc count (before compact): 0
doc count (after compact): 0
insert time: 0 sec
insert time / doc: n/a ms
compact time: 1.0064 sec
compact time / doc: n/a ms
disk size (before compact): 79 bytes
disk size (after compact): 79 bytes
.couch size (before compact): 79 bytes
.couch size (after compact): 79 bytes
.couch size / doc (before compact): n/a bytes
.couch size / doc (after compact): n/a bytes

doc count (before compact): 1
doc count (after compact): 1
insert time: 0.0015 sec
insert time / doc: 1.46 ms
compact time: 1.0051 sec
compact time / doc: 1005.14 ms
disk size (before compact): 315 bytes
disk size (after compact): 4179 bytes
.couch size (before compact): 315 bytes
.couch size (after compact): 4179 bytes
.couch size / doc (before compact): 315 bytes
.couch size / doc (after compact): 4179 bytes

doc count (before compact): 2
doc count (after compact): 2
insert time: 0.0015 sec
insert time / doc: 0.75 ms
compact time: 1.0057 sec
compact time / doc: 502.87 ms
disk size (before compact): 503 bytes
disk size (after compact): 8281 bytes
.couch size (before compact): 503 bytes
.couch size (after compact): 8281 bytes
.couch size / doc (before compact): 251.5 bytes
.couch size / doc (after compact): 4140.5 bytes

doc count (before compact): 3
doc count (after compact): 3
insert time: 0.0017 sec
insert time / doc: 0.56 ms
compact time: 1.0053 sec
compact time / doc: 335.1 ms
disk size (before compact): 693 bytes
disk size (after compact): 8281 bytes
.couch size (before compact): 693 bytes
.couch size (after compact): 8281 bytes
.couch size / doc (before compact): 231 bytes
.couch size / doc (after compact): 2760.33 bytes

doc count (before compact): 4
doc count (after compact): 4
insert time: 0.0017 sec
insert time / doc: 0.44 ms
compact time: 1.0053 sec
compact time / doc: 251.33 ms
disk size (before compact): 883 bytes
disk size (after compact): 8281 bytes
.couch size (before compact): 883 bytes
.couch size (after compact): 8281 bytes
.couch size / doc (before compact): 220.75 bytes
.couch size / doc (after compact): 2070.25 bytes

doc count (before compact): 5
doc count (after compact): 5
insert time: 0.0018 sec
insert time / doc: 0.36 ms
compact time: 1.0054 sec
compact time / doc: 201.08 ms
disk size (before compact): 1071 bytes
disk size (after compact): 8281 bytes
.couch size (before compact): 1071 bytes
.couch size (after compact): 8281 bytes
.couch size / doc (before compact): 214.2 bytes
.couch size / doc (after compact): 1656.2 bytes

doc count (before compact): 6
doc count (after compact): 6
insert time: 0.0019 sec
insert time / doc: 0.32 ms
compact time: 1.0051 sec
compact time / doc: 167.52 ms
disk size (before compact): 1261 bytes
disk size (after compact): 8281 bytes
.couch size (before compact): 1261 bytes
.couch size (after compact): 8281 bytes
.couch size / doc (before compact): 210.17 bytes
.couch size / doc (after compact): 1380.17 bytes

doc count (before compact): 7
doc count (after compact): 7
insert time: 0.0021 sec
insert time / doc: 0.3 ms
compact time: 1.005 sec
compact time / doc: 143.57 ms
disk size (before compact): 1459 bytes
disk size (after compact): 8281 bytes
.couch size (before compact): 1459 bytes
.couch size (after compact): 8281 bytes
.couch size / doc (before compact): 208.43 bytes
.couch size / doc (after compact): 1183 bytes

doc count (before compact): 8
doc count (after compact): 8
insert time: 0.0022 sec
insert time / doc: 0.27 ms
compact time: 1.0049 sec
compact time / doc: 125.61 ms
disk size (before compact): 1655 bytes
disk size (after compact): 8281 bytes
.couch size (before compact): 1655 bytes
.couch size (after compact): 8281 bytes
.couch size / doc (before compact): 206.88 bytes
.couch size / doc (after compact): 1035.13 bytes

doc count (before compact): 9
doc count (after compact): 9
insert time: 0.0023 sec
insert time / doc: 0.25 ms
compact time: 1.0053 sec
compact time / doc: 111.7 ms
disk size (before compact): 1849 bytes
disk size (after compact): 8281 bytes
.couch size (before compact): 1849 bytes
.couch size (after compact): 8281 bytes
.couch size / doc (before compact): 205.44 bytes
.couch size / doc (after compact): 920.11 bytes

doc count (before compact): 10
doc count (after compact): 10
insert time: 0.0025 sec
insert time / doc: 0.25 ms
compact time: 1.0042 sec
compact time / doc: 100.42 ms
disk size (before compact): 2043 bytes
disk size (after compact): 8281 bytes
.couch size (before compact): 2043 bytes
.couch size (after compact): 8281 bytes
.couch size / doc (before compact): 204.3 bytes
.couch size / doc (after compact): 828.1 bytes

doc count (before compact): 50
doc count (after compact): 50
insert time: 0.0072 sec
insert time / doc: 0.14 ms
compact time: 1.0038 sec
compact time / doc: 20.08 ms
disk size (before compact): 10319 bytes
disk size (after compact): 16473 bytes
.couch size (before compact): 10319 bytes
.couch size (after compact): 16473 bytes
.couch size / doc (before compact): 206.38 bytes
.couch size / doc (after compact): 329.46 bytes

doc count (before compact): 100
doc count (after compact): 100
insert time: 0.0136 sec
insert time / doc: 0.14 ms
compact time: 1.0054 sec
compact time / doc: 10.05 ms
disk size (before compact): 20430 bytes
disk size (after compact): 24665 bytes
.couch size (before compact): 20430 bytes
.couch size (after compact): 24665 bytes
.couch size / doc (before compact): 204.3 bytes
.couch size / doc (after compact): 246.65 bytes

doc count (before compact): 500
doc count (after compact): 500
insert time: 0.0687 sec
insert time / doc: 0.14 ms
compact time: 1.0062 sec
compact time / doc: 2.01 ms
disk size (before compact): 104616 bytes
disk size (after compact): 110690 bytes
.couch size (before compact): 104616 bytes
.couch size (after compact): 110690 bytes
.couch size / doc (before compact): 209.23 bytes
.couch size / doc (after compact): 221.38 bytes

doc count (before compact): 1000
doc count (after compact): 1000
insert time: 0.1361 sec
insert time / doc: 0.14 ms
compact time: 1.003 sec
compact time / doc: 1 ms
disk size (before compact): 212260 bytes
disk size (after compact): 217186 bytes
.couch size (before compact): 212260 bytes
.couch size (after compact): 217186 bytes
.couch size / doc (before compact): 212.26 bytes
.couch size / doc (after compact): 217.19 bytes

doc count (before compact): 2500
doc count (after compact): 2500
insert time: 0.4686 sec
insert time / doc: 0.19 ms
compact time: 1.006 sec
compact time / doc: 0.4 ms
disk size (before compact): 814957 bytes
disk size (after compact): 819298 bytes
.couch size (before compact): 814957 bytes
.couch size (after compact): 819298 bytes
.couch size / doc (before compact): 325.98 bytes
.couch size / doc (after compact): 327.72 bytes

doc count (before compact): 5000
doc count (after compact): 5000
insert time: 0.9165 sec
insert time / doc: 0.18 ms
compact time: 1.0065 sec
compact time / doc: 0.2 ms
disk size (before compact): 2012394 bytes
disk size (after compact): 2015330 bytes
.couch size (before compact): 2012394 bytes
.couch size (after compact): 2015330 bytes
.couch size / doc (before compact): 402.48 bytes
.couch size / doc (after compact): 403.07 bytes

doc count (before compact): 7500
doc count (after compact): 7500
insert time: 1.5116 sec
insert time / doc: 0.2 ms
compact time: 2.0112 sec
compact time / doc: 0.27 ms
disk size (before compact): 3778774 bytes
disk size (after compact): 3797090 bytes
.couch size (before compact): 3778774 bytes
.couch size (after compact): 3797090 bytes
.couch size / doc (before compact): 503.84 bytes
.couch size / doc (after compact): 506.28 bytes

doc count (before compact): 10000
doc count (after compact): 10000
insert time: 2.3111 sec
insert time / doc: 0.23 ms
compact time: 3.015 sec
compact time / doc: 0.3 ms
disk size (before compact): 5653905 bytes
disk size (after compact): 5652578 bytes
.couch size (before compact): 5653905 bytes
.couch size (after compact): 5652578 bytes
.couch size / doc (before compact): 565.39 bytes
.couch size / doc (after compact): 565.26 bytes

doc count (before compact): 25000
doc count (after compact): 25000
insert time: 6.8684 sec
insert time / doc: 0.27 ms
compact time: 7.0746 sec
compact time / doc: 0.28 ms
disk size (before compact): 20595235 bytes
disk size (after compact): 20635746 bytes
.couch size (before compact): 20595235 bytes
.couch size (after compact): 20635746 bytes
.couch size / doc (before compact): 823.81 bytes
.couch size / doc (after compact): 825.43 bytes

doc count (before compact): 50000
doc count (after compact): 50000
insert time: 15.8227 sec
insert time / doc: 0.32 ms
compact time: 14.1612 sec
compact time / doc: 0.28 ms
disk size (before compact): 51808040 bytes
disk size (after compact): 51724386 bytes
.couch size (before compact): 51808040 bytes
.couch size (after compact): 51724386 bytes
.couch size / doc (before compact): 1036.16 bytes
.couch size / doc (after compact): 1034.49 bytes

doc count (before compact): 100000
doc count (after compact): 100000
insert time: 35.3071 sec
insert time / doc: 0.35 ms
compact time: 33.4723 sec
compact time / doc: 0.33 ms
disk size (before compact): 125497442 bytes
disk size (after compact): 125419618 bytes
.couch size (before compact): 125497442 bytes
.couch size (after compact): 125419618 bytes
.couch size / doc (before compact): 1254.97 bytes
.couch size / doc (after compact): 1254.2 bytes

doc count (before compact): 250000
doc count (after compact): 250000
insert time: 104.0009 sec
insert time / doc: 0.42 ms
compact time: 97.3738 sec
compact time / doc: 0.39 ms
disk size (before compact): 394489375 bytes
disk size (after compact): 394457190 bytes
.couch size (before compact): 394489375 bytes
.couch size (after compact): 394457190 bytes
.couch size / doc (before compact): 1577.96 bytes
.couch size / doc (after compact): 1577.83 bytes

doc count (before compact): 500000
doc count (after compact): 500000
insert time: 230.6021 sec
insert time / doc: 0.46 ms
compact time: 209.0139 sec
compact time / doc: 0.42 ms
disk size (before compact): 900271866 bytes
disk size (after compact): 900280422 bytes
.couch size (before compact): 900271866 bytes
.couch size (after compact): 900280422 bytes
.couch size / doc (before compact): 1800.54 bytes
.couch size / doc (after compact): 1800.56 bytes

doc count (before compact): 750000
doc count (after compact): 750000
insert time: 354.7959 sec
insert time / doc: 0.47 ms
compact time: 380.9895 sec
compact time / doc: 0.51 ms
disk size (before compact): 1446452532 bytes
disk size (after compact): 1445376102 bytes
.couch size (before compact): 1446452532 bytes
.couch size (after compact): 1445376102 bytes
.couch size / doc (before compact): 1928.6 bytes
.couch size / doc (after compact): 1927.17 bytes

doc count (before compact): 1000000
doc count (after compact): 1000000
insert time: 487.3284 sec
insert time / doc: 0.49 ms
compact time: 570.2633 sec
compact time / doc: 0.57 ms
disk size (before compact): 2023280441 bytes
disk size (after compact): 2022334566 bytes
.couch size (before compact): 2023280441 bytes
.couch size (after compact): 2022334566 bytes
.couch size / doc (before compact): 2023.28 bytes
.couch size / doc (after compact): 2022.33 bytes

From this data, a few assumptions can be made:

  • CouchDB inserts ~2-3k documents / second in a >100k documents database (for this particular hardware / benchmark setup)
  • CouchDB inserts get slower on bigger databases
  • CouchDB seems to use more bytes / document the larger the database gets (this is scary, but might explain the previous 2 observations)
  • The time it takes for compacting a database with identical, unmodified documents seems to be almost equal to the time it took to insert the initial documents. Makes lots of sense assuming the writes are I/O bound.
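The first and third points can be re-derived from the numbers above. The following Python sketch is not part of the benchmark project; it just recomputes inserts per second and bytes per document for a few rows copied from the results.

```python
# (doc count, insert time in seconds, .couch size before compact in bytes),
# copied from the results above.
RESULTS = [
    (1000, 0.1361, 212260),
    (100000, 35.3071, 125497442),
    (500000, 230.6021, 900271866),
    (1000000, 487.3284, 2023280441),
]


def summarize(doc_count, insert_seconds, size_bytes):
    """Return (inserts per second, bytes per document) for one result row."""
    return doc_count / insert_seconds, size_bytes / doc_count


for row in RESULTS:
    rate, per_doc = summarize(*row)
    print(f"{row[0]:>8} docs: {rate:8.0f} docs/sec, {per_doc:8.2f} bytes/doc")
```

For the three rows at 100k documents and above, the rate lands between 2,000 and 3,000 documents per second, while bytes per document keep climbing as the database grows.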

Now since I am new to CouchDB there is a very large chance for problems with my setup and the logic behind my assumptions. However, I hope that by sharing them I can get feedback to make the benchmarks better and provide explanations for the observed characteristics.

I also hope that some of you might feel compelled to fork the project on GitHub and provide more benchmarks. Personally I am going to work on analyzing replication next. If I find time I'll also add some CSV exports and pretty rendering facilities with some google charts.

Comment if you have any thoughts, want to see more tests or share some religious propaganda related to your key-value storage system of choice : ).

-- Felix Geisendörfer aka the_undefined

PS: A few questions I can already see coming up:

Why does compact always take at least 1 second?
Because I use a while() loop with sleep(1) to determine when compact is done. I could check more frequently, but it's not really a variable I'm interested in.
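For illustration, here is a rough Python sketch of such a polling loop. It is not the benchmark's actual code: it assumes the loop sleeps before each check, and that completion is detected via the compact_running flag in the database info returned by GET /<db>. The HTTP call itself is passed in as a function so the sketch runs without a server.

```python
import time


def wait_for_compaction(get_db_info, poll_interval=1.0, sleep=time.sleep):
    """Poll until the database reports that compaction has finished.

    get_db_info is any callable returning the database info as a dict,
    for example the parsed JSON body of GET /<db>.
    Returns the number of polls that were needed.
    """
    polls = 0
    while True:
        sleep(poll_interval)  # assumption: wait first, then check
        polls += 1
        if not get_db_info().get("compact_running", False):
            return polls


# Stand-in for a real HTTP call: reports "running" twice, then "done".
fake_responses = iter([
    {"compact_running": True},
    {"compact_running": True},
    {"compact_running": False},
])
polls = wait_for_compaction(lambda: next(fake_responses), poll_interval=0.0)
print(polls)
```

With a 1 second interval, a loop like this can never report less than about 1 second, which matches the compact times for the small databases above.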

Why does compact increase the file size for < 50000 documents?
Good question, I have no idea. Anybody?

What insert method is used?
Have a look at the benchmark source. Basically it's bulk-inserts of 1000 items at a time with pre-generated UUIDs.
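As a rough idea of what that means, here is a Python sketch that builds the request bodies for CouchDB's _bulk_docs endpoint in batches of 1000, with IDs generated up front on the client. It is not the benchmark source, and the document shape is a made-up placeholder.

```python
import json
import uuid


def bulk_batches(docs, batch_size=1000):
    """Yield JSON request bodies for POST /<db>/_bulk_docs, batch_size docs at a time."""
    for start in range(0, len(docs), batch_size):
        batch = []
        for doc in docs[start:start + batch_size]:
            # Pre-generate the ID on the client so the server does not have to.
            batch.append({"_id": uuid.uuid4().hex, **doc})
        yield json.dumps({"docs": batch})


# Hypothetical document shape; the real benchmark documents may differ.
payloads = list(bulk_batches([{"n": i} for i in range(2500)]))
print(len(payloads))
```

Each payload would be sent as the body of one POST request, so 2500 documents take three round trips instead of 2500.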

I ran the benchmark and got some additional output
Yeah, I removed some debugging / activity indicators to make the results more readable for this article.

 