close

Many Web-based applications include the requirement to display some part of the user search results in chunks of pages. For example, think of the typical search engine, say Google, which displays search results in chunks of 10 or more items per page.

In this blog, I will describe how to implement this design requirement using features provided by DB2 ® Universal Database TM (UDB) and how to cache the results using a stateful bean or session. Though this example is specific to DB2 and JSP, the same approach can be used with any relational database and dynamic pages technique.

We will consider two approaches:

* Querying the database to get records in chunks using the rownext() feature of DB2
* Caching results in stateful session beans or HttpSession

The scenario

We shall consider these approaches by looking at a scenario for a product-based Web site which allows the user to search for products offered based on category, price, brand, and so on. The results will be displayed in chunks of user-defined page sizes, say 10 or 20 items.

Here are our sample table definitions:

CREATE TABLE Product_Category(
category_id integer not null,
category_name varchar(200), not null
category_decription varchar2(500)
)

CREATE TABLE Product (
product_id Integer not null,
product_name varchar2(50) not null,
prod_category_id integer not null,
product_decription varchar2(100) not null,
product_price decimal(15,2),
product_status char(1) default 'Y',
product_width decimal(5,2) not null,
product_length decimal(5,2) not null,
product_created_date  timestamp,
product_netweight  decimal(10,3),
constraint PK_PRODUCT_ID PRIMARY KEY
(product_id),
constraint product_category_fk foreign key
(prod_category_id)
references Product_Category (category_id))

Using DB2 features to achieve pagination

This approach is good for applications where you have a chance to do some design work before the database and application go into production.

Here we assume that the user wants the products to be displayed in chunk of 20 results per page. As we know, the rows in a relational table have no specific order, so the SQL query needs to be designed to allow the data to be retrieved in a particular order using an ORDER BY clause on the primary key, which in our case is our product id.

DB2 allows you to order a result set on the fly and fetch an arbitrary number of rows from the beginning or end of this result set.

Here is a query to select first 20 records from our product table based on the category “Books.”

SELECT PRODUCT_ID, PRODUCT_NAME, PRODUCT_DESCRIPTION,
PRODUCT_PRICE, PROD_CATEGORY_ID
FROM PRODUCT,PRODUCT_CATEGORY
WHERE
PRODUCT.PROD_CATEGORY_ID
= PRODUCT_CATEGORY.CATEGORY_ID
AND
PRODUCT_CATEGORY.CATEGORY_ID = 'Books'
ORDER BY PRODUCT.PRODUCT_ID
FETCH FIRST 20 ROWS ONLY

The ORDER BY will force a sort of the entire result set in memory, so we’re not doing this to improve DB2 server performance (although sending only 10 rows to the client may improve network performance).

If the order is not a factor and we just want to get the first 20 rows based on the user criteria, then we can eliminate the ORDER BY to save the sort on the DB2 server:

SELECT PRODUCT_ID, PRODUCT_NAME, PRODUCT_DESCRIPTION,
PRODUCT_PRICE, PROD_CATEGORY_ID
FROM PRODUCT,PRODUCT_CATEGORY
WHERE
PRODUCT. PROD_CATEGORY_ID
= PRODUCT_CATEGORY.CATEGORY_ID
AND
PRODUCT_CATEGORY.CATEGORY_ID = 'Books'
FETCH FIRST 20 ROWS ONLY

Since the application needs to display say 20 results per page, we can make use of the row_next() feature provided by DB2.

Below is an example of how to select results from rows numbered 21 to 40. We assume that the user has selected category as “Books” and product description as “Application Servers”.

SELECT * FROM (
SELECT PRODUCT_ID, PRODUCT_NAME,
PRODUCT_DESCRIPTION, PRODUCT_PRICE,
rownumber() OVER
(ORDER BY PRODUCT_ID)
AS ROW_NEXT
FROM PRODUCT,PRODUCT_CATEGORY WHERE
PRODUCT.PROD_CATEGORY_ID
= PRODUCT_CATEGORY.CATEGORY_ID
AND
PRODUCT_CATEGORY.CATEGORY_ID = 'Books'
AND
PRODUCT.PRODUCT_DESCRIPTION LIKE
'Application Servers'
)
AS PRODUCT_TEMP WHERE
ROW_NEXT BETWEEN 21 and 40

In the case of a Web-based application, the parameters “21” and “40” can be specified at run time. Here is the modified query:

SELECT * FROM (
SELECT PRODUCT_ID, PRODUCT_NAME,
PRODUCT_DESCRIPTION, PRODUCT_PRICE,
rownumber() OVER
(ORDER BY PRODUCT_ID) AS ROW_NEXT
FROM PRODUCT,PRODUCT_CATEGORY
WHERE
PRODUCT. PROD_CATEGORY_ID
= PRODUCT_CATEGORY.CATEGORY_ID
AND
PRODUCT_CATEGORY.CATEGORY_ID = 'Books'
AND
PRODUCT. PRODUCT_DESCRIPTION LIKE
'Application Servers'
)
AS PRODUCT_TEMP WHERE
ROW_NEXT BETWEEN ? and ?

The rownumber() function allows the developers to assign row numbers to results sets on the fly.

If the row_next clause is removed ( ROW_NEXT BETWEEN ? and ? ), all the rows that matched the criteria will be returned.

The SELECT * FROM clause in case above can be thought of as a temporary table holding the entire result set that matches the criteria and than returning that set of results specified in the row range.

Using the rownumber() feature will cause an additional performance impact on the system as the database first fetches all rows that matches the criteria and returns only those rows specified in the range.

Next we look at the trade-offs and advantages of this approach and when it’s an appropriate choice.

Trade-offs for this approach

  • Transactions and performance impact

In this approach, the transactions need to be handled programmatically by the developer. In our scenario, if we assume that products are updated only once daily, say at midnight, we can use the lowest transaction isolation level of READ_UNCOMITTED.

Using the row_next feature definitely will have some performance impact on the database application as compared to a query without the row_next feature. Also this approach requires us to hit the database each time to get next sequence of result sets based on the range of rows.

  • Code portability

Fetching an arbitrary number of rows from the beginning or end of this result set is provided by almost every RDBMS system, though the SQL syntax varies. Hence if we move from one database vendor to other, we might have to modify the query accordingly.

  • Providing seamless back button functionality

If our design requires browser back button functionality and the user is interested in viewing the latest information from the database each time, then this design can provide a possible solution. The drawback is that this may require a hit on the database for each results page, which might affect the performance drastically.

Basically the URL query will have the search parameters embedded which can be fed to our SQL Query. Here is a possible URL query:
http://localhost:8050/ProductSearch?
minprize=50& maxprice=100& pagecount=20& resultsPerpage=10

When the user clicks on browser back button, the previous URL to be displayed in browser will be:

http://localhost:8050/ProductSearch?
minprize=50&maxprice=100&pagecount=10&resultsPerpage=10

We’ll assume that the latest page id is available in the session. So the browser back button can be tracked as follows:

if(!session.getAttribute("pageID).equalsIgnoreCase
(request.getAttribute("pageCount")){
//Then user has clicked the back button
//Get possible search parameters, pagecount
//and resultsPerPage from URL and formulate
//a dynamic URL and post the information back
//to the servlet.The URL would be
http://localhost:8050/ProductSearch?minprize=50&
maxprice=100&pagecount=10&resultsPerpage=10</p>
return;
//Done display the results screen page
}

Using this approach the user will be able to view the latest information from the database.

Next we move on to our second approach, which is widely used in many applications.

Caching result sets in stateful beans or HttpSession

This is one of the widely used approaches to pagination in which results are cached in some stateful object like session beans or HttpSession. The choice between HttpSession or stateful bean depends upon the size of the data to be retrieved. If there is large amount of data one should definitely go for the stateful session bean method.

In this approach we would get all the results from the database in one go and store the results in our stateful session bean.

Here is a possible query to retrieve all the books that are available in our product table:

[

SELECT PRODUCT_ID, PRODUCT_NAME, PRODUCT_DESCRIPTION,
PRODUCT_PRICE, PROD_CATEGORY_ID
FROM PRODUCT
WHERE
PRODUCT. PROD_CATEGORY_ID
= PRODUCT_CATEGORY.CATEGORY_ID
AND
PRODUCT_CATEGORY.CATEGORY_ID = 'Books'

Here is our design of our remote interface stateful bean:

public void setSearchResults(Category category)
throws RemoteException

public Category getPerPagSearchResults(String page_count)
throws RemoteException

Here is what typical Category and Product objects would look like:

public class Category extends java.io.Serializable {
private String category_id,
private String category_name,
//Holds an Array List of products under that category.
private ArrayList products;
}

public class Product extends java.io.Serializable {
private String product_id;
private String product_name;
private String product_desc;
private String product_price;
}

Here are the steps required to achieve pagination:

1. Get the result set based on the search criteria.
2. Loop though the result set and store each product object in our product ArrayList.
3. Store the array list of project objects in the category object.
4. Call our remote method setSearchResults and store the category object.
5. In order to get results to be displayed per page, call the remote method getPerPagSearchResults() passing the page_count (such as 10 or 20) to retrieve the corresponding records.

Here is our typical implementation of getPerPagSearchResults(String page_count):

//Check if Product List contains at least page_count products
int len = category.getProduct_list().length>page_count
? page_count : category.getProduct_list().length

ArrayList product_tempList  = new ArrayList();

for( int i = page_count+1 ; i < len ; i++){
//Loop though the product list array and get
//corresponding records
product_tempList.add(category.getProduct_list().get(I);
}

//return category object containing an arraylist of
//products containg records from page_count
//to page_count+10 or product_list.length if
//product_list.length is less than page_count+10

return product_tempList

The product list can be iterated and all the product information can be displayed at that point. Whenever the next button is clicked the next page_count can be passed to our servlet, which does the same process listed above.

Trade-offs for this approach

  • Transaction and performance impact

In this approach, fetching the result set based on the search criteria needs to be handled by the developer, while maintaining the state of our category object to implement caching is handled by the EJB Container. This provides an additional layer for managing our client state.

This approach will definitely provide better performance than our earlier one, as we will avoid subsequent database calls and will have caching of result sets.

Since we are caching the results, it is possible that the user might be viewing stale data. Because of this issue, this approach is well-suited for an application where the data is not constantly updated. For example if you are updating your products list every night at midnight, then the viewer will likely not be viewing stale data when using the application throughout the day.

  • Code portability

This approach provides a vendor-neutral approach as our caching logic can be moved from one EJB container to another without changing our code. However we might have to modify our EJB vendor-specfic deployment descriptor if the code needs to be run on different EJB containers.

  • Providing back button functionality using this approach

Consider an example where the browser cache is disabled and the user clicks on the browser back button. That would result in the browser requesting the JSP page again. Now suppose we navigate across different search criteria pages. As per this design only the latest search results will be displayed, as we are keeping our latest search results in the Stateful bean. So in order to track which search criteria the user is working on, we can store the search criteria in the session and compare it with request parameters present in the URL when user clicks on the browser back button.

Here is sample code that is present in the JSP which displays Search Results page:

SearchCriteria searchObjSession = (SearchCriteria)
session.getAttribute(“SearchCriteria”)

Next we would get search parameters from the URL .Here we assume that URL is:

http://localhost:8050/ProductSearch?minprize=50&maxprice=100
&pagecount=10&resultsPerpage=10

Here is the sample code:

SearchCriteria  previousSearch = new SearchCriteria();
If(request.getAttribute("minPrice") != null){
previousSearch.setMinPrice(request.getAttribute("minPrice"))</p>
//and so on for remaining parameters
//Next we would compare our SearchCriteria object from session with our PreviousSearch object:
if(!searchObjSession.equals(previousSearch)){
//User is trying to view previous search results
//Show an error to user that he is allowed to work on latest
//search criteria only or else
//depending on your business logic you can give a call to
//database with previous search
//criteria and store it in our Stateful bean and display the
//results back to the user.
//Redirect him to Search Criteria Page or Search Results Page
// return;
}

Which approach is best for me?

Based on your requirements you can apply a combination of both approaches. The most widely used approach is that which provides caching of result sets.

Consider a scenario where we want to display only the latest 20 products that have been added. The first approach definitely can be used since in this case sorting of the result sets and providing only 20 records would be a job for the database system. This scenario is very useful if your database is located remotely as this will ensure only 20 records flow over the network layer.

Conclusion

We’ve talked about how to achieve pagination using two possible approaches. Depending upon your application and business needs, one of these approaches can be used to solve the pagination problem for your Web applications.

This article of mine was first published by IBM DeveloperWorks at -http://www.ibm.com/developerworks/data/library/techarticle/0307balani/0307balani.html.All rights retained by IBM and the author.

Tags : ArchitecturePagination
Naveen

The author Naveen