Thursday, August 29, 2013

Create OpenStack Tomcat Cartridge in Apache Stratos

Cartridges

A Cartridge is a package of code or configuration that plugs into Stratos to offer a new PaaS service. A Cartridge is also a Virtual Machine (VM) image plus configuration, and it can operate in two modes: single-tenant and multi-tenant.

For more information see Cartridges.

Before creating the cartridge, you have to build the OpenStack image.

Create OpenStack image to support Tomcat
  • First, download an Ubuntu cloud image. You can download Ubuntu cloud images from here.
  • Then upload that image to OpenStack. Please refer to creating OpenStack images for more details.
  • After successfully uploading the cloud image to OpenStack, you can launch an instance from that image.
  • Now you need to set up the instance for Tomcat. To do that, SSH into the instance and install the following software.
    • java 1.6.x - refer here for more details
    • tomcat7
    • git
    • ruby
    • thrift - refer here for more details
    • xml_grep - use sudo apt-get install xml-twig-tools
  • Installing tomcat7 will install several Java versions, but this installation needs Java 1.6.x. To make it the default you can use the following commands. The --install commands only register JDK 1.6.x as an alternative with a low priority of 1, so the final --set command selects it explicitly.
    • sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.6.x/jre/bin/java" 1
    • sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/lib/jvm/jdk1.6.x/bin/javac" 1
    • sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/lib/jvm/jdk1.6.x/jre/bin/javaws" 1
    • sudo update-alternatives --set java /usr/lib/jvm/jdk1.6.x/jre/bin/java
  • After installing the above prerequisites, copy the following items into the /opt directory.
    • get-launch-params.rb
    • healthcheck.sh
    • stratos-init.sh
    • cartridge_data_publisher_1.0.2
  • You can get these from the Apache Stratos git repo.
  • Before creating the image, update the /etc/rc.local file to trigger stratos-init.sh at boot time. Your instance is then ready: create a snapshot from the instance and make the image from that snapshot. For more details on how to create OpenStack images, please refer here.
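
Before taking the snapshot, it can help to confirm that nothing from the list above is missing. The following is a minimal sketch of such a check, not something shipped with Stratos: the function names check_commands and check_opt_files are made up for this illustration, and the sketch assumes the four items sit directly under /opt with the names listed above.

```shell
#!/bin/sh
# Hypothetical pre-snapshot check; these function names are not part of Stratos.

# Report any required command that is not on the PATH.
# Returns 0 if every command was found, 1 otherwise.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Report any of the four Stratos items that is absent from the given
# directory (defaults to /opt). Returns 0 if all are present, 1 otherwise.
check_opt_files() {
    dir="${1:-/opt}"
    missing=0
    for f in get-launch-params.rb healthcheck.sh stratos-init.sh \
             cartridge_data_publisher_1.0.2; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing file: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

On the instance this could be run as follows. The grep only confirms that /etc/rc.local mentions stratos-init.sh, not that the script runs correctly at boot.

    check_commands java git ruby thrift xml_grep && check_opt_files /opt
    grep -q stratos-init.sh /etc/rc.local && echo "rc.local hook present"
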
Create Tomcat Cartridge
Now you are at the final stage of creating the Tomcat cartridge in Apache Stratos. You can refer here to install Apache Stratos. After a successful installation, edit tomcat.xml according to your IaaS image properties. You can download a sample tomcat.xml file from here.

After that, you can start the Apache Stratos Controller and subscribe to the newly created Tomcat cartridge.

Wednesday, August 21, 2013

Install Apache Stratos

Apache Stratos can be installed on a single node or on multiple nodes. If you are installing on multiple nodes, copy stratos_installer to each node and update the configuration parameters in conf/setup.conf accordingly.

Installation Guide
1. Install the following prerequisites. Use sudo apt-get install for this.
  • java - jdk1.6.x
  • git
  • facter
  • zip
  • mysql-server
  • Gitblit

2. Build Apache Stratos from source:
  • git clone https://git-wip-us.apache.org/repos/asf/incubator-stratos.git
  • cd incubator-stratos
  • mvn clean install

3. Copy the cloud controller, stratos controller, elb, agent and cli packages to a folder inside stratos_installer. This could be called "stratos_installer/packages":
  • apache-stratos-cc-<version>.zip 
  • apache-stratos-sc-<version>.zip
  • apache-stratos-elb-<version>.zip   
  • apache-stratos-agent-<version>.zip
  • apache-stratos-cli-<version>.zip 
4. Download the WSO2 Message Broker binary distribution from http://wso2.com and copy it to stratos-pack-path. You could use any preferred message broker product that supports AMQP.

5. Extract the WSO2 Message Broker distribution to the desired path (this will be identified as stratos-path) and set its port offset in repository/conf/carbon.xml to 5. This moves the broker's AMQP port from the default 5672 to 5677.

6. Download the MySQL Java connector from http://dev.mysql.com/downloads and copy the jar file to the above packages folder.

7. Create and download the keys from the IaaSs and store them in a secure location.

8. If Apache Stratos is being set up on multiple nodes, open up the security rules in the IaaSs for the following ports (defined in ./conf/setup.conf): 22, 443, 8280, 4103, 4100, agent_https_port, agent_http_port, elb_port, agent_clustering_port, sc_cluster_port, elb_cluster_port, cassandra_port, stratos_db_port and userstore_db_port.

9. Either download pre-built cartridge images from the Apache Stratos website or create your own cartridges. Please refer to the Apache Stratos documentation for more information on creating cartridge images. For Amazon EC2, you can find pre-built PHP, MySQL and Tomcat cartridges published in the Amazon EC2 AMI image repository.

10. If all Stratos products are installed on the same node, update the /etc/hosts file with a set of domain/host names, one for each product. The same values should be set in the conf/setup.conf file. If the products are installed on different nodes, use the actual hostnames.

    <ip-address> stratos.apache.org        # stratos domain
    <ip-address> mb.stratos.apache.org     # message broker hostname
    <ip-address> cc.stratos.apache.org     # cloud controller hostname
    <ip-address> sc.stratos.apache.org     # stratos controller hostname
    <ip-address> elb.stratos.apache.org    # elastic load balancer hostname
    <ip-address> agent.stratos.apache.org  # agent hostname

11. Update ./conf/setup.conf and configure parameters. 

12. Run setup.sh as root to install.

    sudo ./setup.sh -p "<product-list>"
    <product-list> could be defined as "cc sc elb agent" or any other combination according to the
    deployment configuration.
    Example:
    sudo ./setup.sh -p "all"

13. If you need to clean the setup, run the command below:
      sudo ./clean.sh -u <mysql-username> -p <mysql-password>
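
Steps 5 and 10 above can be sketched as two small shell helpers. This is only an illustration: the function names stratos_hosts_entries and port_with_offset are made up here and are not part of stratos_installer, and 5672 is assumed to be the broker's default AMQP port, which is consistent with the 5677 quoted in step 5.

```shell
#!/bin/sh
# Hypothetical helpers; neither function ships with stratos_installer.

# Print the /etc/hosts lines from step 10 for a single-node setup,
# mapping every Stratos hostname to the same IP address.
stratos_hosts_entries() {
    ip="$1"
    for host in stratos.apache.org mb.stratos.apache.org \
                cc.stratos.apache.org sc.stratos.apache.org \
                elb.stratos.apache.org agent.stratos.apache.org; do
        printf '%s %s\n' "$ip" "$host"
    done
}

# Port that results from applying a port offset to a base port (step 5).
# Assumption: the base AMQP port is 5672, so an offset of 5 gives 5677.
port_with_offset() {
    echo $(( $1 + $2 ))
}
```

For example, the entries could be appended to /etc/hosts with the following command, where 192.168.1.10 is a placeholder address:

    stratos_hosts_entries 192.168.1.10 | sudo tee -a /etc/hosts
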


Friday, August 16, 2013

Review - A New Design for Distributed Systems: The Remote Memory Model

What problem does the paper attack? How does it relate to and improve upon previous work in its domain?

Existing distributed systems rely on clients that each have their own memory, which is used for their virtual memory. This paper discusses conventional virtual memory systems in distributed environments and their limitations. It proposes a solution to those limitations, a new design called “the remote memory model”, and describes the impact of this architecture on the design of distributed systems.

In the existing virtual memory architecture, each program has a large, linear address space in which it places code and data. Most conventional virtual memory systems use demand paging to retrieve data from secondary storage on demand, when the system needs to access it. This architecture cannot be used in distributed systems that contain diskless clients, so the paper proposes “the remote memory model” as a solution to this problem.

What are the key contributions of the paper?

The main contribution of this paper is the design of a remote memory model for distributed systems. The model can be divided into three parts.

  • Several client machines 
  • Various server machines, with one or more dedicated machines called remote memory servers
  • A communication channel

In this model, client machines use the communication channel to access memory on the remote memory server. The model supports heterogeneous client machines, and the authors keep it as general as possible. To improve on conventional systems, the authors also consider the following properties.

  • Additional memory 
  • Arbitrarily large storage capacity 
  • Data sharing 
  • Offloading file server
  • Remote memory semantics

A communication protocol that supports the remote memory model is another contribution of this paper. The protocol provides reliability, architecture independence, and efficiency when communicating with the remote memory server. The authors introduce two protocol layers: the Xinu Paging Protocol (XPP) layer and the Negative Acknowledgement Fragmentation Protocol (NAFP) layer.

Comments

The name of this paper is “A New Design for Distributed Systems: The Remote Memory Model”. From the title, a reader gets the idea that the paper belongs to the distributed systems category. But the content of the paper does not discuss distributed systems as such. The authors do not give any literature background on distributed systems; they talk entirely about their design model.
  
The authors do not clearly specify the virtual memory problem in distributed systems. They just mention that there is a problem and that the remote memory model is the solution. They do not compare their model with alternatives or prove that it is suitable for the existing problems in distributed systems. They only tested their prototype with several client machines and different kinds of operating systems and showed the results, without comparing those results with existing systems. They also do not discuss the communication latency to the remote memory server, a cost that existing systems do not have. Future work on the remote memory model is not mentioned in the paper either.

References
[1] D. Comer and J. Griffioen, “A new design for distributed systems: The remote memory model,” in Proceedings of the USENIX Summer Conference, pp. 127–135, Citeseer, 1990

Review - SwissQM: Next Generation Data Processing in Sensor Networks

What problem does the paper attack? How does it relate to and improve upon previous work in its domain?

Existing systems for data processing in sensor networks are limited in two fundamental ways: lack of data independence and poor integration with the higher layers of the data processing chain. This paper presents a new technique for data processing in sensor networks called SwissQM. Compared to other data processing techniques in sensor networks, it supports adaptability, multiple users and applications, high quality turn-around, extensibility and optimized use of resources.

SwissQM borrows many ideas from TinyDB. TinyDB is a declarative database abstraction layer for TinyOS that provides in-network query processing and aggregation. Compared with TinyDB, SwissQM is intended as the next generation of query processing in sensor networks. It provides richer and more flexible functionality at the sensor network level and a powerful, adaptable layer to the outside world. It also provides data independence, query language independence and optimized performance over a wide range when compared with existing systems such as TinyDB and Giotto, a runtime environment for embedded systems. Unlike other systems, SwissQM is based on a special machine that runs optimized bytecode rather than queries. It provides a generic, high-level declarative programming model and imposes no data model. SwissQM also takes a new approach to query processing: it translates queries into bytecode, so the range of expressions it supports is not limited. It supports multiple queries at two layers: first it merges different user queries into a virtual query, and then it performs multi-programming in the query machine.

What are the key contributions of the paper?

Going through the paper, we can see that SwissQM’s design considerations are its key contributions. SwissQM was designed to fulfill several key requirements that had not been fulfilled by other data processing techniques in sensor networks. Those are

  • Separation of sensors and external interface
  • Dynamic, multi-user, multi-programming environment 
  • Optimized use of sensors   
  • Extensibility

These novel design principles help SwissQM become the next generation of data processing in sensor networks. The authors divide a SwissQM sensor network into four parts and show the performance of this new design approach. A SwissQM sensor network consists of the sensor network gateway, the query machine, the query machine bytecode and query programs.

Comments

The separation between gateway and sensor nodes, and the implementation of a virtual machine at the sensor nodes rather than a query processor, are the main design decisions in SwissQM. These two decisions give SwissQM Turing-completeness, independence from the query language used, independence from the user model, and extensibility.

Rather than only presenting the features, the paper gives a few examples of SwissQM queries and explains them precisely. The authors use a well-known compiler technique called “templating” to translate virtual queries into SwissQM queries. They also clearly illustrate init, delivery and reception, the three main QM program sections, using different examples.

Memory is a crucial factor in sensor networks. SwissQM has an instruction set of fifty-nine instructions, and the authors show that the complete instruction set needs 33kB of flash memory and 3kB of SRAM. This is a big advantage over TinyDB, which takes 65kB of flash memory, roughly twice as much, and 3kB of SRAM.

Message size is also a crucial factor in ad-hoc sensor networks. Different radio platforms provide different message sizes, and SwissQM addresses this problem. SwissQM opts for a message size of 36 bytes, whereas TinyDB uses 49 bytes. This is a big step for data processing in ad-hoc sensor networks.

SwissQM uses a three-tier architecture for visualizing the sensor network and permits multi-query optimization for more efficient use of sensor networks. It uses query merging, sub-expression matching and window processing optimization for multi-query optimization. With this three-tier architecture, the optimized virtual queries are transformed into network queries that can be easily understood by the sensor nodes.

References
[1] R. Mueller, G. Alonso and D. Kossmann, "SwissQM: Next generation data processing in sensor networks," in Third Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 7-10, 2007

Abstract - NoSQL Query Processing System for Wireless ad-hoc and Sensor Networks By Manula Thantriwatte & Chamath Keppetiyagama

Wireless Sensor Networks (WSN) are distributed systems typically composed of embedded devices, each equipped with a processing unit, a wireless communication interface, as well as sensors and/or actuators. Many applications have been implemented on sensor networks demonstrating the versatility of this technology, and some are already finding their way into the mainstream. Often, sensor nodes are tiny battery powered devices. Usually, these devices are capable of forming ad hoc wireless networks and communicating with each other.

Two main abstractions are available for developing applications for WSNs: message passing and SQL query interfaces. The first exposes the network to the programmers, and the second hides the complexity of the network behind a database abstraction. There are two popular query processing systems for wireless sensor networks, namely TinyDB (for TinyOS) and TikiriDB (for Contiki). These query processing systems represent the sensors in the sensor network as a table. Users submit queries at the base station, which converts them into a format that the sensor nodes understand and sends them to the sensor network to get the results.

Database abstractions currently used in WSN are based on Relational Database Management Systems (RDBMS). These relational models use ACID properties (atomicity, consistency, isolation, durability). TinyDB and TikiriDB use SQL queries.  However, a relational database model that guarantees ACID properties is not a good match for a wireless sensor network since consistent connectivity or the uninterrupted operation of the sensor nodes cannot be expected.

We noted that the NoSQL approach, which does not rely on the ACID properties, is a better match for query processing systems for WASNs. We developed a NoSQL based database abstraction on the Contiki operating system, which is popular among the WSN community.

We designed NoSQL query syntaxes for querying sensor networks that are similar to the RedisDB NoSQL queries and we also adopted the RedisDB architecture for our implementation. We implemented the following NoSQL queries on Contiki.

  • Select Query: Appropriate keyword followed by relevant key
  • Join Query: Appropriate keyword followed by relevant key followed by valid set condition for key
  • Range Query: Appropriate keyword followed by relevant key followed by valid range condition
  • Ranking Data: Appropriate keyword followed by key and relevant member name
  • Get the key of members: Appropriate keyword followed by relevant key

We have implemented the Redis back-end in an iterative manner. RedisDB supports different data structures such as Strings, Hashes, Lists, Sets and Sorted-sets. We prototyped the back-end using each of the above mentioned data structures and evaluated the performance. Based on the experience gained through these prototypes, we decided to use Sorted-sets for the final implementation.

The Sorted-set implementation of our NoSQL database abstraction works with two values called the key and the member. In this implementation we mapped the sensing field of a query to a key and the sensing values to members. With these key and member values we can use the following NoSQL queries in our database abstraction.

  • ZADD key score member: Add a member to a Sorted-set, or update its score if it already exists.
  • ZCARD key: Get the number of members in a Sorted-set.
  • ZCOUNT key min max: Count the members in a Sorted-set with scores within the given values.
  • ZINCRBY key increment member: Increment the score of a member in a Sorted-set. 
  • ZINTERSTORE destination numkeys key [key ...]: Intersect multiple Sorted-sets and store the resulting Sorted-set in a new key.
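
To make the key, score and member roles concrete, the following is a toy, file-backed emulation of three of the commands above (ZADD, ZCARD and ZCOUNT) in plain shell. It is only a sketch of the semantics: it is neither Redis nor our Contiki implementation, the lower-case function names and the ZDIR variable are invented for this illustration, and using a sample sequence number as the score is an assumption made for the example.

```shell
#!/bin/sh
# Toy emulation of Sorted-set semantics; not Redis and not the Contiki code.
# Each key is a file in $ZDIR that holds one "score member" pair per line.
ZDIR="${ZDIR:-$(mktemp -d)}"

# zadd KEY SCORE MEMBER: add a member, or update its score if it exists.
zadd() {
    file="$ZDIR/$1"
    touch "$file"
    # Drop any existing line for this member (compared as strings),
    # then append the member with its new score.
    awk -v m="$3" '($2 "") != (m "")' "$file" > "$file.tmp"
    printf '%s %s\n' "$2" "$3" >> "$file.tmp"
    # Keep the members ordered by score.
    sort -n "$file.tmp" > "$file"
    rm -f "$file.tmp"
}

# zcard KEY: print the number of members stored under KEY.
zcard() {
    file="$ZDIR/$1"
    if [ -f "$file" ]; then
        awk 'END { print NR }' "$file"
    else
        echo 0
    fi
}

# zcount KEY MIN MAX: print how many members have MIN <= score <= MAX.
zcount() {
    file="$ZDIR/$1"
    if [ -f "$file" ]; then
        awk -v lo="$2" -v hi="$3" \
            '$1 + 0 >= lo + 0 && $1 + 0 <= hi + 0 { n++ } END { print n + 0 }' "$file"
    else
        echo 0
    fi
}
```

With the sensing field as the key and each reading as a member, a short session could look like this:

    zadd temperature 1 23.5
    zadd temperature 2 24.0
    zcard temperature
    zcount temperature 2 5
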

In the performance analysis we compared the query execution time of the TikiriDB database abstraction and our newly designed NoSQL database abstraction. We collected the execution times of queries while varying the sample period of both SQL and NoSQL queries. We observed that for shorter time periods both database abstractions show the same performance, but as the time period increases, the NoSQL database abstraction performs better. The performance bottleneck in the SQL database abstraction is the processing time of SQL queries. From these results we can conclude that processing NoSQL queries is much more efficient than processing SQL queries in sensor networks.

We also measured the power required to process the NoSQL queries as well as the SQL queries. Our implementation consumes less power than the SQL based query processing system on Contiki. This is important since sensor nodes are usually energy constrained and an efficient query processing system helps to conserve energy.

Thursday, August 15, 2013

Apache Stratos

Introduction
Apache Stratos (Incubating) is a polyglot PaaS framework, providing developers a cloud-based environment for developing, testing, and running scalable applications, and IT providers high utilization rates, automated resource management, and platform-wide insight including monitoring and billing. [1]

Apache Stratos helps run Tomcat, PHP and MySQL apps as a service on all the major cloud infrastructures. It brings self-service management, elastic scaling, multi-tenant deployment, usage monitoring as well as further capabilities. In addition, it brings the ability to take any server software and make it into a Stratos Cartridge. This means any server can be run 'as-a-Service' alongside the other app containers. [1]

Layered Architecture
According to the diagram, the middle layer is the Stratos foundation, which consists of the Elastic Load Balancer, Cloud Controller, Stratos Controller and CLI / WebUI Tool. On top of that there are services such as the Logging Service, Registry Service, Messaging Service and Billing Service. We can configure those services with the top layer. After you configure the foundation layer with these services, you can use them at run time.

Cartridges
A Cartridge is a package of code or configuration that plugs into Stratos to offer a new PaaS service. A Cartridge is also a Virtual Machine (VM) image plus configuration, and it can operate in two modes: single-tenant and multi-tenant.

For more information see Cartridges.

References