Doctoral Thesis

User Interfaces Supporting
Casual Data-Centric Interactions on the Web

Doctoral Thesis at MIT EECS / CSAIL

by David F. Huynh

August 2007

Today’s Web is full of structured data, but much of it is transmitted as natural language text or binary images that are not conducive to further machine processing by the time it reaches the user’s web browser. Consequently, casual users (those without programming skills) are limited to whatever features web sites offer. Encountering a few dozen addresses of public schools listed in a table on one web site and a few dozen private schools on another, a casual user would have to painstakingly copy and paste each address, along with each school’s name, into an online map service to get a unified view of where the schools are relative to her home. Any more sophisticated operation on data encountered on the Web, such as re-plotting the results of a scientific experiment found online because the user wants to test a different theory, would be tremendously difficult.

Conversely, to publish structured data to the Web, a casual user must settle for static data files or HTML pages that offer none of the features provided by commercial sites, such as searching, filtering, maps, and timelines, or even a feature as basic as sorting. To offer a rich experience on her site, the casual user would have to single-handedly build a three-tier web application, a task that normally takes a team of engineers several months.

This thesis explores user interfaces that let casual users extract and reuse data from today’s Web as well as publish data to the Web in richly browsable and reusable form. Under the assumption that casual users most often deal with small and simple data sets, declarative syntaxes and direct manipulation techniques can support tasks that previously required programming in experts’ tools.

User studies indicated that casual users could use tools built with such declarative syntaxes and direct manipulation techniques. Moreover, the data publishing tool built from this research has been used by actual users on the Web for many purposes, from presenting educational materials in classrooms to listing products for very small businesses.

                          KB   Pages
Full thesis            9,837     134
Individual chapters
  0. Prologue            160      16
  1. Introduction      2,606      16
  2. Related Work        318      14
  3. Publishing Data   4,752      42
  4. Extracting Data   1,886      18
  5. Integrating Data  1,928      16
  6. Conclusion          251       8
  7. Bibliography        134       4