Assemble-A-Site Challenges – Abstraction


One of the largest and most obvious challenges we’re facing while developing the Assemble-A-Site project is the extra level of abstraction. For those of you who do not know what levels of abstraction are, let me explain.

The information that makes up your web page is stored somewhere. In some cases, this will all be in one place (e.g. a single HTML file), while in others it may be spread across many places (e.g. several tables in one or more databases, files, and system memory). The levels of abstraction essentially describe how many steps it takes, starting from the referenced file, to pull all that information together into the HTML code that is sent to the browser.

So, the very simplest webpage is recorded as a file of HTML code. This has zero levels of abstraction, because the file called contains all the code already. If the referenced file connects directly to a database to build its content, it has one level of abstraction. As systems get more complex, they start to require more levels of abstraction.

Data access libraries

The code required to connect to a database, collect information, and return it to the program in a useful format can often be quite long. This is especially true if you use proper error checking and write robust code. Repeating this code everywhere you need the database quickly becomes wasteful and hard to maintain once you connect more than two or three times. Instead, you add a level of abstraction to your project, and create a data access library. This library contains all the methods necessary to connect to the database, retrieve, insert or update data, and return results in a useful format. Because all database code lives in one place, error management and code improvements only need to be made once. The page now calls the library, and the library calls the database, so we have two levels of abstraction.
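To make the idea concrete, here is a minimal sketch of a data access library in Python, using the standard sqlite3 module. It is purely illustrative: the class name, method names and the pages table are assumptions made for the example, not code from the Black Square CMS.

```python
import sqlite3


class DataAccess:
    """Hypothetical data access library: the only code that touches the database."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.row_factory = sqlite3.Row  # rows can be converted to dicts

    def query(self, sql, params=()):
        """Run a SELECT and return the rows as a list of plain dicts."""
        try:
            cursor = self._conn.execute(sql, params)
            return [dict(row) for row in cursor.fetchall()]
        except sqlite3.Error as exc:
            # Error handling lives here once, instead of in every page.
            raise RuntimeError(f"query failed: {exc}") from exc

    def execute(self, sql, params=()):
        """Run an INSERT, UPDATE or DELETE and return the affected row count."""
        try:
            with self._conn:  # commits on success, rolls back on error
                cursor = self._conn.execute(sql, params)
            return cursor.rowcount
        except sqlite3.Error as exc:
            raise RuntimeError(f"statement failed: {exc}") from exc


# Usage: a page asks the library for data and never opens a connection itself.
db = DataAccess()
db.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)")
db.execute("INSERT INTO pages (title) VALUES (?)", ("Home",))
rows = db.query("SELECT id, title FROM pages")
```

A page using this sketch only ever sees lists of dicts or a RuntimeError, so a change to the connection or error handling is made in one class rather than in every page.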

The Black Square CMS uses three, and sometimes four, levels of abstraction. It is modular, so we build another level for the modules to manage their own data. For example, if the project contains a News module, we’ll create a News library. Any page that needs to interact with the News module will call methods of this library to do so. The News library will use the data access library to interact with the database. Three levels of abstraction.
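As a rough sketch of how a module library sits on top of the data access library, consider the Python example below. The NewsLibrary class, the news table and its columns are all assumptions made for illustration; only the idea of a News library calling the data access library comes from the description above.

```python
import sqlite3


class DataAccess:
    """Stand-in for the data access library (assumed interface)."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.row_factory = sqlite3.Row

    def query(self, sql, params=()):
        return [dict(row) for row in self._conn.execute(sql, params)]

    def execute(self, sql, params=()):
        with self._conn:  # commit on success, roll back on error
            return self._conn.execute(sql, params).rowcount


class NewsLibrary:
    """Hypothetical module layer: pages talk to this, never to DataAccess."""

    def __init__(self, db):
        self._db = db
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS news "
            "(id INTEGER PRIMARY KEY, headline TEXT, body TEXT)"
        )

    def add_news_item(self, headline, body):
        self._db.execute(
            "INSERT INTO news (headline, body) VALUES (?, ?)", (headline, body)
        )

    def get_news_item(self, item_id):
        """Return one news item as a dict, or None if it does not exist."""
        rows = self._db.query(
            "SELECT id, headline, body FROM news WHERE id = ?", (item_id,)
        )
        return rows[0] if rows else None


# Usage: the page calls the News library, which calls the data access library.
news = NewsLibrary(DataAccess())
news.add_news_item("Launch", "We are live")
item = news.get_news_item(1)
```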

Adding Ajax

When we use Ajax on the website, or in the CMS (which we do quite extensively), we add another level of abstraction. The page requests information from the Ajax layer, which interrogates the module layer, which calls the data access layer, which communicates with the database. Four levels of abstraction.
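The Ajax layer can be pictured as a thin function that turns request parameters into a JSON response. The sketch below is an assumption-laden illustration in Python: handle_ajax, the response shape and the in-memory NewsLibrary stand-in are all hypothetical, and a real endpoint would sit behind a web server.

```python
import json


class NewsLibrary:
    """Stand-in module layer; a dict replaces the data access layer and database."""

    def __init__(self):
        self._items = {1: {"id": 1, "headline": "Launch"}}

    def get_news_item(self, item_id):
        return self._items.get(item_id)


def handle_ajax(news, params):
    """Hypothetical Ajax endpoint: validate the request, ask the module layer,
    and return a JSON string for the page's script to consume."""
    try:
        item_id = int(params["id"])
    except (KeyError, TypeError, ValueError):
        return json.dumps({"ok": False, "error": "invalid id"})

    item = news.get_news_item(item_id)
    if item is None:
        return json.dumps({"ok": False, "error": "not found"})
    return json.dumps({"ok": True, "item": item})


# Usage: the page sends {"id": "1"} and receives JSON back.
response = handle_ajax(NewsLibrary(), {"id": "1"})
```

The page never sees the News library directly in this arrangement, which is exactly the extra step that makes this the fourth level.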

Each level of abstraction becomes necessary as the scale of the system increases, and it becomes more complex. The levels break down the process of managing all the information into chunks we can get our heads around. But, as the name “abstraction” implies, these levels create a distance between the developer and the working code. The developer who calls the “getNewsItem” method doesn’t know, and probably doesn’t care, what code is being used by that method, or by the deeper data access methods, to fetch the information he requires. All he cares about is that he gets an information structure that represents his news item. This works perfectly when everything works perfectly, but can cause issues if there are bugs in the deeper levels.

Inefficient coding

As an aside, multiple levels of abstraction also run the risk of inefficient coding. If the developer using a library does not know, in detail, which processes the library uses, he may use its methods inefficiently. For example, a method in the data access layer might return a certain piece of information by looping through 100 items. A method in the module layer might call this method for each of 100 of its own items to identify a more specific piece of information. This results in 10,000 iterations to identify the information. That is not serious if it is necessary, but it is very inefficient if not. If the developer using the module layer method knows nothing of this, he might call the method repeatedly in his code, because it is so useful, costing millions of iterations when fetching the data once and reusing the stored result everywhere would be far more efficient. This is a very common problem with jQuery and other JavaScript libraries.
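The arithmetic is easy to reproduce. The Python sketch below counts iterations in a made-up data layer; the class names, the tag data and the most_used_tag method are all invented for the illustration.

```python
class DataLayer:
    """Hypothetical data access layer that scans 100 records per call."""

    def __init__(self):
        self.iterations = 0
        self.records = [{"tag": f"tag{i}"} for i in range(100)]

    def count_matches(self, tag):
        total = 0
        for record in self.records:  # 100 iterations on every call
            self.iterations += 1
            if record["tag"] == tag:
                total += 1
        return total


class ModuleLayer:
    """Hypothetical module layer that calls the data layer once per tag."""

    def __init__(self, data):
        self.data = data
        self.tags = [f"tag{i}" for i in range(100)]

    def most_used_tag(self):
        # 100 tags x 100 records = 10,000 iterations hidden behind one call.
        return max(self.tags, key=self.data.count_matches)


data = DataLayer()
module = ModuleLayer(data)

# Wasteful: calling the convenient method every time the value is needed.
for _ in range(3):
    module.most_used_tag()
wasteful = data.iterations

# Better: call once, keep the result, reuse it.
data.iterations = 0
tag = module.most_used_tag()
for _ in range(3):
    _ = tag  # reuse the stored value instead of calling again
cached = data.iterations
```

Three innocent-looking calls cost 30,000 iterations here, while storing the result costs 10,000 no matter how often it is reused. Nothing in the call to most_used_tag hints at that cost, which is the point.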

Applying this to Assemble-a-Site

The challenge with Assemble-a-Site is that we need to make the modules individually configurable. In the Black Square CMS, we custom build each module to the project requirements. This means that each module is pre-defined to work a certain way, and the structure of the information stored about it is known and can be hard-coded into the module layer. This is not true for the Assemble-a-Site project.

For example, if a Black Square client site needs a product catalogue, we sit with the client to define exactly what constitutes a product, and we build the catalogue to manage that. So, if the product is a book, we will define a title, edition, publication date, ISBN, price, etc. The module will know that these fields all exist, and we can hard-code the module to use them. With Assemble-a-Site, however, the product catalogue module needs to be configurable for each use. Each site that adds the product catalogue must be able to configure its own set of fields: one site may wish to use it for books, while another might want to sell skin care products. The same code must be used in every case.

To cater for this, we need to add another level of abstraction to the Assemble-a-Site system to manage the configuration. So, the referenced page calls the module layer, which passes through the configuration layer in order to query the data access layer, which interrogates the database. That makes four levels of abstraction or, if we use Ajax, five.
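A minimal sketch of such a configuration layer, in Python, might look like the following. Everything here is hypothetical: the class names, the site names, and the skin care fields are invented for the example (only the book fields come from the description above), and a plain dict stands in for the data access layer and database.

```python
# Per-site field definitions. The skin care fields are invented examples.
SITE_FIELDS = {
    "bookshop": ["title", "edition", "publication_date", "isbn", "price"],
    "skincare": ["name", "skin_type", "volume_ml", "price"],
}


class ConfigLayer:
    """Hypothetical configuration layer: knows which fields each site uses."""

    def __init__(self, site_fields):
        self._site_fields = site_fields

    def fields_for(self, site):
        if site not in self._site_fields:
            raise ValueError(f"no catalogue configured for site {site!r}")
        return list(self._site_fields[site])


class ProductCatalogue:
    """One module, same code for every site; no field names are hard-coded."""

    def __init__(self, config, storage):
        self._config = config
        self._storage = storage  # dict standing in for data access + database

    def add_product(self, site, values):
        fields = self._config.fields_for(site)
        unknown = sorted(set(values) - set(fields))
        if unknown:
            raise ValueError(f"fields not configured for {site!r}: {unknown}")
        # Build the record from the configured fields; missing ones are None.
        record = {field: values.get(field) for field in fields}
        self._storage.setdefault(site, []).append(record)
        return record

    def list_products(self, site):
        return list(self._storage.get(site, []))


# Usage: two sites, two different product shapes, one catalogue module.
catalogue = ProductCatalogue(ConfigLayer(SITE_FIELDS), storage={})
book = catalogue.add_product("bookshop", {"title": "Example Book", "price": 9.99})
cream = catalogue.add_product("skincare", {"name": "Day cream", "price": 12.5})
```

The trade-off is the one described above: the module no longer knows its own fields, so every read and write has to consult the configuration first.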

 
