In case you were skeptical, here's a case that happens to be my case.

 Case starts to look like a funny word when you use it too much, doesn't it?  

 

Case Study #1

Premier Properties, LTD.
Industry: Real Estate
URL: https://premierpropertiesltd.com

Background: Premier Properties was interested in creating a website that provided all of the features consumers were looking for in a realty website. Consumers would use the site to search for residential properties, commercial properties, and land. Additionally, the planned website was intended to introduce consumers to the Premier Properties brand. To meet this goal, the website was designed to inform consumers about Premier Properties' current listings and open houses, its agents, its preferred service providers, and the history of the company.

Explanation of Overall Problem:


In this project, specific property types were managed through eight separate data sets. One of these data sets was combined with another for end output, which is why fewer data sets appear separately in the list below. Each data set had an extensive number of columns.
Any time a project includes managing a large amount of data, there are various challenges that a developer or programmer faces.  Planning decisions about the structure of data and the method used to access that data often make or break the success of a project in these cases. 

 

  • Single Family data set (141 columns)
  • Condo data set (137 columns)
  • Two-Family data set (128 columns)
  • Multi-Family data set (82 columns)
  • Land data set (74 columns)
  • Business Opportunity & Commercial data set (135 combined columns)


The primary challenges faced on this site could all be summarized as speed problems: whether the question was record access speed or search speed, it always came down to speed. This is a problem that is commonly left inadequately addressed on MLS-based real estate websites.

I tried hard to make this site a different story.



Problem #1 : Record Access Speed & Search Speed

Solution : Convert the data files to local MySQL database tables

To solve the problem of record access speed, the solution I developed was to download the data into local MySQL database tables, one per property type.

Below is an explanation of how that happened.

First of all, it's important to note that each property type requires a separate query to the MLS server to pull the data down from the remote location. Once the connection to the server is established through the phRETS class, data is gathered from the server and held in memory using functions built into phRETS. At that point, PHP converts the in-memory data into a CSV string delimited by three unique special characters.
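
To make that concrete, here is a minimal sketch of the download step for one property type. It uses phRETS 1.x-style calls (Connect, SearchQuery, FetchRow); the login URL, credentials, resource and class names, query string, and delimiter characters are all placeholders rather than the production values.

    <?php
    // Sketch of the nightly pull-down for one property type.
    // phRETS 1.x-style calls; every credential, name, and delimiter below is a
    // placeholder, not a value from the production site.
    require_once 'phrets.php';

    $rets = new phRETS;
    $rets->Connect('https://rets.example-mls.com/login', 'rets_user', 'rets_pass');

    // Each property type (Single Family, Condo, etc.) needs its own query.
    $search = $rets->SearchQuery('Property', 'RES', '(Status=|A)');

    $lines = array();
    while ($record = $rets->FetchRow($search)) {
        // "\x01" stands in for one of the unusual field-delimiter characters.
        $lines[] = implode("\x01", $record);
    }
    $rets->FreeResult($search);
    $rets->Disconnect();

    // "\x02" stands in for the record (line) delimiter; the whole property
    // type now sits in memory as a single delimited string.
    $csv = implode("\x02", $lines);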

The delimited CSV file is saved to the local server for later use, and then the next property type's data is downloaded. Once all of the property types have their data on the local server, an import query is run to load each file into its respective database table.
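
As a rough illustration of that import, the delimited file can be bulk-loaded with MySQL's LOAD DATA, using the same terminators the string was built with. The file path, table name, and credentials below are placeholders, and the statement assumes local_infile is enabled on both the client and the server.

    <?php
    // Sketch: persist the delimited string, then bulk-load it into that
    // property type's table. Paths, names, and delimiters are placeholders.
    file_put_contents('/var/data/mls/single_family.csv', $csv);

    $db = mysqli_init();
    $db->options(MYSQLI_OPT_LOCAL_INFILE, true); // allow LOAD DATA LOCAL from this client
    $db->real_connect('localhost', 'db_user', 'db_pass', 'realty');

    // PHP expands "\x01" and "\x02" to the actual delimiter bytes before the
    // query is sent, so the terminators match the characters used when the
    // string was built. REPLACE keeps existing records in sync.
    $db->query("
        LOAD DATA LOCAL INFILE '/var/data/mls/single_family.csv'
        REPLACE INTO TABLE single_family
        FIELDS TERMINATED BY '\x01'
        LINES TERMINATED BY '\x02'
    ");
    $db->close();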

This process happens on a daily basis during low-usage times. It is automated through a CRON job and proper permissions on the import script. The process typically takes anywhere from 5 to 10 minutes to complete, and typical record volume is in the neighborhood of 5,000 to 10,000 records.
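
The schedule itself could be as simple as a single crontab entry; the time and paths here are illustrative rather than the site's actual values.

    # Illustrative crontab entry: run the MLS import nightly at 3:15 AM,
    # a low-usage window, and append output to a log for troubleshooting.
    15 3 * * * /usr/bin/php /var/www/scripts/mls_import.php >> /var/log/mls_import.log 2>&1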

Converting the data into local MySQL tables largely solves half of the problem of fast record access for end users.

Because the remote data is downloaded to CSV and stored in a local database, the need for external connections and queries is avoided at the moment speed actually matters: when an end user runs a search.

Beyond the local database itself, the queries are also optimized to pull in only the necessary columns. A local PHP configuration file stores the required columns for each property type's table query. This gives the system flexibility and modularity for the future: when new fields are added to the MLS service, the configuration file can be edited without touching the propertyClass (a class I developed to perform operations related to the search and display of properties on the website).
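
A configuration file like that can be as simple as column lists keyed by property type. The sketch below is hypothetical; the field names are placeholders, and the real file mirrors the MLS's own column names.

    <?php
    // columns.config.php - hypothetical sketch of the per-property-type column
    // lists. The propertyClass implodes the relevant list into its SELECT
    // statements, so new MLS fields only require an edit here.
    return array(
        'single_family' => array('mls_number', 'list_price', 'street_address', 'city', 'bedrooms', 'bathrooms', 'square_feet'),
        'condo'         => array('mls_number', 'list_price', 'street_address', 'city', 'bedrooms', 'bathrooms', 'assoc_fee'),
        'land'          => array('mls_number', 'list_price', 'street_address', 'city', 'acreage'),
        // ...one entry per property type table.
    );

Building a SELECT statement from this file keeps field names out of the class entirely, which is what makes the configuration-only edits possible.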

It’s important to note that each table had a search index added as well.  The search index is another key to fast record retrieval.
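
The exact indexes depend on which columns end users search against most; a one-time setup for a single table might look something like this (table and column names are placeholders, not the production schema).

    <?php
    // Sketch: one-time index setup for a property table, run when the table is
    // created. Table and column names are placeholders for illustration.
    $db = new mysqli('localhost', 'db_user', 'db_pass', 'realty');
    $db->query('CREATE UNIQUE INDEX idx_sf_mls ON single_family (mls_number)');
    $db->query('CREATE INDEX idx_sf_search ON single_family (city, list_price, bedrooms)');
    $db->close();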


 

Problem #2 : Photo records require a separate, external query

Solution : Pre-process the data 


Remote Photo Pull Down

The fact that remote photo locations are not listed in any of the property type columns was disconcerting. At first, it seemed that a separate, external photo query would have to run for every record each time a search was performed, and I identified this as a serious performance problem early on.

To arrive at a viable solution, I looked at what made the rest of the system work: local files. The solution I developed was to prepare the image paths ahead of consumer consumption. In other words, since I could not get around external queries for the photos of each MLS record, I changed when that processing happened: all of the photo data is pre-processed at a time when it does not matter to the end user.

Below is the process.

 Steps to retrieve remote photo locations:

  1. Perform photo queries for all of the MLS records immediately following each daily CSV data pull-down
  2. Record the photo location returned by each query in a local database table for use in end-user search queries
  3. Join each photo in through a local database join, rather than a query external to the property website (a sketch of this pass follows below)
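
Here is a rough sketch of that nightly photo pass and the later join. It uses a phRETS 1.x-style GetObject call with the Location flag set, which asks the server for photo URLs rather than binary image data (where the MLS supports it); all table, column, and resource names are placeholders.

    <?php
    // Sketch of the nightly photo pass. $rets is an already-connected phRETS
    // instance (see the earlier sketch); $db is a mysqli connection. Every
    // name below is a placeholder for illustration.
    $insert = $db->prepare(
        'REPLACE INTO property_photos (mls_number, photo_number, photo_url) VALUES (?, ?, ?)'
    );

    $listings = $db->query('SELECT mls_number FROM single_family');
    while ($row = $listings->fetch_assoc()) {
        // Location = 1 asks the RETS server for URLs instead of image bytes.
        $photos = $rets->GetObject('Property', 'Photo', $row['mls_number'], '*', 1);
        foreach ($photos as $photo) {
            if (!empty($photo['Success'])) {
                $insert->bind_param('sis', $row['mls_number'], $photo['Object-ID'], $photo['Location']);
                $insert->execute();
            }
        }
    }

    // At search time, the photos come from a plain local join, with no external
    // query left in the request path:
    //   SELECT p.*, ph.photo_url
    //   FROM single_family AS p
    //   LEFT JOIN property_photos AS ph
    //          ON ph.mls_number = p.mls_number AND ph.photo_number = 1
    //   WHERE p.city = ? AND p.list_price <= ?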


Conclusion

This project really allowed me to experience the trials and triumphs of working with a large data set. Although some of the problems initially seemed quite daunting, sticking to basic principles and logically stepping through each process helped me arrive at reasonable solutions. The end result is a website that continues to perform very fast searches thanks to indexed tables and well-planned local queries.

I can't really take credit for the principles employed on this site, however.  Many of the same ideas I employed here have been used on the web, and in networking, for a long time.  

DNS is a perfect example. Although DNS records change all the time, they are cached at several different levels so that servers and hosts aren't constantly hitting the root servers to ask about something they already know.

Similarly, on this website, a cached copy, or snapshot, of the entire data set is essentially stored on a server local to where the data is being fetched. In a nutshell, having all of this data ready to be accessed locally was the ultimate key to speed.

Full Disclaimer: The project example above was developed while I was employed as a web developer at Ocreative Design Studio. The purpose of this case study is to outline my role in the project and to explain the challenges I faced during the project, as well as the solutions I developed to meet those challenges.

As the lead (and sole) web developer on this project, I claim authorship (but not ownership) of the majority of this project's code. Any and all designs are the property of Ocreative Design Studio, LLC, and I make no claim of authorship or ownership of any design element on the Premier Properties, LTD website.

Images and graphics displayed in this case study are included solely to provide a visual representation of the website.