Tracking retail traffic – Data visualization and Computer Vision

June 2nd, 2011

Looking for data visualization software?

Data visualization is a major part of our development platform. Computer systems that can build proper data visualizations aid decision making and take business decisions to a new level.

An example of tracking was shown earlier: tracking mouse movements on a web page. That data is key to deciding where to place items in an online storefront, and the same approach extends easily to a real, physical storefront.

Using computer vision technology, people can be tracked through video (surveillance footage, for example) and their position data captured. Later, this data can be visualized and used to determine the best product placements.
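As a rough illustration of the idea (not our production system), here is a minimal Python sketch using the open source OpenCV library: background subtraction picks out moving people in video, and the centre of each moving blob is logged as position data for later visualization. The file names and thresholds are placeholder assumptions.

import csv
import cv2

# Placeholder input file; a camera index such as 0 would also work.
cap = cv2.VideoCapture("store_footage.avi")
subtractor = cv2.createBackgroundSubtractorMOG2()

with open("positions.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["frame", "x", "y"])
    frame_no = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        # Drop shadow pixels and noise, keep solid foreground.
        _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        for c in contours:
            if cv2.contourArea(c) < 500:  # assumed minimum person size
                continue
            x, y, w, h = cv2.boundingRect(c)
            # Log the centre of each moving blob for later visualization.
            writer.writerow([frame_no, x + w // 2, y + h // 2])
        frame_no += 1
cap.release()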

Contact us for more information

SEO SERP Heatmap Development using jQuery and PHP/MySQL

May 31st, 2011

SEO Development – Maximizing your page layout potential

A heat map done for a sheet music tabs site

One way to maximize ad revenue and user experience is through trial and error.

Our approach: use a custom heat map that tracks users’ mouse movements and clicks. This gives us a better idea of how to monetize a site, sell more effectively, improve the user experience, and more.

Check out the image above to see an example heat map.
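The tracker itself is built with jQuery on the client and PHP/MySQL on the server. As a language-neutral sketch of the core aggregation idea, assuming mouse events have already been logged as (x, y) pixel coordinates, here is a minimal Python version; the cell size and sample data are placeholder assumptions.

from collections import Counter

CELL = 20  # assumed heat map cell size, in pixels

def aggregate(points):
    """Bucket (x, y) coordinates into a grid and count hits per cell."""
    heat = Counter()
    for x, y in points:
        heat[(x // CELL, y // CELL)] += 1
    return heat

# A few hand-made sample clicks; the real data comes from the tracker.
clicks = [(103, 48), (110, 52), (400, 310), (108, 47)]
for cell, hits in aggregate(clicks).most_common():
    print(cell, hits)  # hottest cells first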

Real Time Data Reporting Dashboard

October 26th, 2010

Developing reporting applications using HTML 5 web standards gives users access to their data anywhere, at any time, on a variety of devices. Perfect for managers and CRM-type systems, these tools help decision makers act on real-time, up-to-date data.

An example of some HTML 5 reports using the canvas tag developed for our clients:

Example of a report generated with HTML 5 canvas

More information on how to use the canvas tag can be found here:

http://programminglinuxgames.blogspot.com/2010/07/using-canvas-in-firefox-and-safari-for.html

Linux HA – Load Balancing and High Availability

September 21st, 2010

Linux HA – Heartbeat, ldirectord, Apache, MySQL, Debian

Today, the internet is booming. Millions of people are viewing billions of pages. Some of the larger sites require robust hardware setups if the content they serve is remotely complex, involving things like relational databases and dynamic content. We are going to look at a practical solution that is both cost effective and scalable to handle varying loads. High availability and load balancing are techniques we can implement for Apache and MySQL using Heartbeat, a small, open source software package.

There are many ways this package can be set up. What we require is a number of physical servers sitting behind a router. Our goals are fault tolerance and performance scalability.

Simple Linux HA – Scalable Performance

Our first configuration lacks some high availability features. It has a single point of failure: the load balancing machine, Linux HA-1. Other possible points of failure are the router and the network file storage. Those two problems have solutions of their own and are not in the scope of this article.

This is our original layout and has served quite well in many cases. It is simpler to set up and scales well as we add more machines.

The MySQL servers are set up in master-master replication mode, so a write on either machine replicates to the other.
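A minimal sketch of the relevant my.cnf fragments for such a pair; the server IDs and offsets are illustrative, and each server additionally needs a replication user plus a CHANGE MASTER TO statement pointing at the other:

# --- MySQL server 1 ---
[mysqld]
server-id                = 1
log_bin                  = mysql-bin
# Interleave auto-increment values so the two masters never collide.
auto_increment_increment = 2
auto_increment_offset    = 1

# --- MySQL server 2 ---
[mysqld]
server-id                = 2
log_bin                  = mysql-bin
auto_increment_increment = 2
auto_increment_offset    = 2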

Diagram of Linux HA simple layout

The performance here is achieved by the ldirectord daemon running on Linux HA-1. To clients, it appears to be both the Apache server and the MySQL server. It keeps track of open connections on each machine and routes each new connection (Apache or MySQL) to the least loaded host (the scheduling algorithm is chosen in the ldirectord configuration).
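A minimal ldirectord.cf sketch of the web half of such a setup; all IP addresses and health-check strings below are placeholder assumptions. A MySQL pool is declared the same way, with service=mysql and a test query as the health check.

checktimeout=10
checkinterval=5
autoreload=yes
quiescent=yes

# Virtual HTTP service on the shared IP; scheduler "lc" = least-connection.
virtual=192.168.0.100:80
        real=192.168.0.11:80 gate
        real=192.168.0.12:80 gate
        service=http
        request="alive.html"
        receive="OK"
        scheduler=lc
        protocol=tcp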

A step-by-step example of this setup can be found on our blog: http://programminglinuxgames.blogspot.com/2010/09/load-balancing-web-servers-using.html

Note that this example does not use Heartbeat. Heartbeat is the tool that adds fault tolerance to this setup; we discuss it below.

Linux HA – Fault Tolerance and Scalable Performance

The next logical step is to remove the single point of failure, the Linux HA-1 machine. We can set up Heartbeat on two machines and allow one to take over if the other fails at any time.

For one machine to take over from the other, both machines share a “virtual” network interface. This is simply a second IP address on a virtual interface, as supported in Linux. The circle surrounding Linux HA-1 and Linux HA-2 in the diagram represents this interface. You must assign it a separate, new IP address and have your router send all traffic to that virtual IP address.
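A minimal sketch of the matching Heartbeat (version 1 style) configuration; the node names, interface, and virtual IP are illustrative, and the same files go on both machines. Node names must match the output of uname -n.

# /etc/ha.d/ha.cf -- who heartbeats with whom, and how failure is detected
keepalive 2
deadtime 10
bcast eth0
node linux-ha-1
node linux-ha-2
auto_failback on

# /etc/ha.d/haresources -- preferred owner, the virtual IP, and the service
linux-ha-1 IPaddr::192.168.0.100/24/eth0 ldirectord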

Highly Available and Performance Scalable set-up

More Information

http://www.linux-ha.org/wiki/Main_Page – Linux HA
http://www.ultramonkey.org/3/linux-ha.html – Ultramonkey (Heartbeat and Ldirectord)
http://en.wikipedia.org/wiki/Linux-HA – Wikipedia

Check out our blog

September 13th, 2010

About our Blog: More How-To Information

For lots more how-to information, check out our blog at http://programminglinuxgames.blogspot.com/.

Python OpenCL Example – OpenCL Language

September 1st, 2010

Python OpenCL Example: pyOpenCL

Example of a program written in the OpenCL programming language.

OpenCL is designed to let highly parallel computations run on any kind of highly parallel processor. The listing below comes from the ‘examples’ folder of pyOpenCL.

About OpenCL

OpenCL stands for Open Computing Language. Its goal is to allow programs to be written once and run across multiple platforms. One of OpenCL’s advantages is that programs can be written for the GPU, one of the most parallel processors to date, which offers large amounts of parallel processing power. You will find OpenCL support on Linux, Windows, and Mac OS (Snow Leopard or newer only, unfortunately), with full GPU support for NVIDIA and ATI video cards on any of those operating systems. OpenCL is also supported on some multi-core and multi-CPU setups.

Alternatives

Similar packages include NVIDIA’s Compute Unified Device Architecture (CUDA) and Microsoft’s DirectCompute. Neither is as cross-platform as OpenCL.

Overview

We create two arrays of random numbers (using the numpy package), then write an OpenCL ‘kernel’ that sums the two arrays and writes the results to an output buffer. We then read that buffer back into Python’s memory and print statistics on the result.

Source Code

import pyopencl as cl
import numpy
import numpy.linalg as la

# Two arrays of 50,000 random single-precision floats.
a = numpy.random.rand(50000).astype(numpy.float32)
b = numpy.random.rand(50000).astype(numpy.float32)

# Pick an OpenCL device and create a command queue on it.
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

# Copy the inputs to device memory; allocate the output buffer.
mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, b.nbytes)

# The kernel: each work-item sums one pair of elements.
prg = cl.Program(ctx, """
__kernel void sum(__global const float *a,
                  __global const float *b, __global float *c) {
    int gid = get_global_id(0);
    c[gid] = a[gid] + b[gid];
}
""").build()

# One work-item per element; None lets the runtime pick the work-group size.
prg.sum(queue, a.shape, None, a_buf, b_buf, dest_buf)

# Read the result back into host memory.
a_plus_b = numpy.empty_like(a)
cl.enqueue_copy(queue, a_plus_b, dest_buf)

# The first norm should be 0: the device result matches numpy's a + b.
print(la.norm(a_plus_b - (a + b)), la.norm(a_plus_b))

Links:

PyOpenCL
Python
Boost C++ Libraries (required for PyOpenCL)
More info on OpenCL

Web Crawling Project

April 13th, 2010

Hamiltonian Consulting Develops Custom Web Crawling Application

Web Crawling

For the CityDirect network

Task: develop a system for web data mining in which users can input specialized URLs to guide the crawler and manually rank data sources for reliability.

Achievements

  • Gathered large amounts of data (hundreds of GB) into a relational database (PostgreSQL).
  • Multi-process and multi-threaded application written in Python (a minimal sketch follows below).
  • Due to code reuse and good practices, delivered the system in half the time others had quoted.
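
As a rough illustration of the multi-threaded part only, here is a minimal Python sketch; the seed URLs, thread count, and storage step are placeholder assumptions, not the delivered system (which wrote to PostgreSQL):

import queue
import threading
import urllib.request

# Seed URLs; in the real system these came from user input.
seeds = queue.Queue()
for url in ["http://example.com/", "http://example.org/"]:
    seeds.put(url)

results = []
lock = threading.Lock()

def worker():
    # Each worker pulls URLs until the queue is empty.
    while True:
        try:
            url = seeds.get_nowait()
        except queue.Empty:
            return
        try:
            body = urllib.request.urlopen(url, timeout=10).read()
            with lock:
                # The production crawler stored pages in PostgreSQL.
                results.append((url, len(body)))
        except OSError:
            pass  # skip unreachable hosts in this sketch
        finally:
            seeds.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)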