Wednesday, February 13, 2019

Introduction Python Pandas

Pandas is an open-source Python library providing high-performance data manipulation and analysis tools through its powerful data structures. The name Pandas is derived from "Panel Data", an econometrics term for multidimensional data sets.

In 2008, developer Wes McKinney started building Pandas because he needed a high-performance, flexible tool for data analysis.

Prior to Pandas, Python was mainly used for data munging and preparation, and contributed little to the analysis itself. Pandas closed this gap. Using Pandas, we can accomplish five typical steps in the processing and analysis of data, regardless of its origin: load, prepare, manipulate, model, and analyze.
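Those steps can be sketched in a few lines. This minimal example uses an inline CSV string (via `io.StringIO`) in place of a real data file; the column names and values are made up for illustration.

```python
import io
import pandas as pd

csv_data = io.StringIO(
    "name,dept,salary\n"
    "Alice,HR,50000\n"
    "Bob,IT,\n"            # missing salary value
    "Carol,IT,70000\n"
)

df = pd.read_csv(csv_data)           # load
df = df.dropna(subset=["salary"])    # prepare: drop rows with missing salary
df["bonus"] = df["salary"] * 0.10    # manipulate: derive a new column
print(df["salary"].mean())           # analyze: 60000.0
```

The same pattern applies whatever the origin of the data: `read_csv` could just as well be `read_excel`, `read_sql` or `read_json`.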

Python with Pandas is used across a wide range of academic and commercial domains, including finance, economics, statistics, and analytics.

Key Features of Pandas
Fast and efficient DataFrame object with default and customized indexing.
Tools for loading data into in-memory data objects from different file formats.
Data alignment and integrated handling of missing data.
Reshaping and pivoting of data sets.
Label-based slicing, indexing and subsetting of large data sets.
Columns from a data structure can be deleted or inserted.
Group by data for aggregation and transformations.
High performance merging and joining of data.
Time Series functionality.
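A few of the features listed above can be shown in one short sketch: customized (label-based) indexing, integrated missing-data handling, and group-by aggregation. The data below is invented for illustration.

```python
import pandas as pd

sales = pd.DataFrame(
    {"region": ["North", "North", "South", "South"],
     "units":  [10, None, 7, 12]},
    index=["q1", "q2", "q1", "q2"],   # customized index labels
)

print(sales.loc["q1"])                       # label-based selection of q1 rows
sales["units"] = sales["units"].fillna(0)    # integrated missing-data handling
totals = sales.groupby("region")["units"].sum()   # group by for aggregation
print(totals)
```

Here `totals` contains 10.0 for "North" (10 + the filled-in 0) and 19.0 for "South" (7 + 12).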
Python Pandas - Environment Setup
The standard Python distribution does not come bundled with the Pandas module. A lightweight way to install it is with pip, the popular Python package installer:

pip install pandas
If you install the Anaconda Python distribution, Pandas is installed by default. Other distributions that bundle Pandas are listed below.

Windows
Anaconda (from https://www.continuum.io) is a free Python distribution for the SciPy stack. It is also available for Linux and Mac.

Canopy (https://www.enthought.com/products/canopy/) is available as a free as well as a commercial distribution with the full SciPy stack for Windows, Linux and Mac.

Python (x,y) is a free Python distribution with SciPy stack and Spyder IDE for Windows OS. (Downloadable from http://python-xy.github.io/)

Linux
The package managers of the respective Linux distributions are used to install one or more packages from the SciPy stack.

For Ubuntu Users

sudo apt-get install python-numpy python-scipy python-matplotlib ipython
ipython-notebook python-pandas python-sympy python-nose
For Fedora Users

sudo yum install numpy scipy python-matplotlib ipython python-pandas sympy
python-nose atlas-devel
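Whichever route you take, a quick import check confirms that the installation worked. The version printed will of course vary with your installation.

```python
# Sanity check: pandas is importable and reports its version.
import pandas as pd

print(pd.__version__)   # e.g. "1.5.3" -- depends on what you installed
```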

Introduction To Python Data Science

Data science is the process of deriving knowledge and insights from large and diverse sets of data by organizing, processing and analysing them. It involves many disciplines, such as mathematical and statistical modelling, extracting data from its sources, and applying data visualization techniques. It often also involves big data technologies for gathering both structured and unstructured data. Below we look at some example scenarios where data science is used.

Recommendation systems
As online shopping becomes more prevalent, e-commerce platforms are able to capture users' shopping preferences as well as the performance of various products in the market. This leads to recommendation systems: models that predict a shopper's needs and show the products the shopper is most likely to buy.

Financial Risk management
The financial risk involved in loans and credit is better analysed by using a customer's past spending habits, past defaults, other financial commitments and many socio-economic indicators. These data are gathered from various sources in different formats; organising them and gaining insight into a customer's profile requires data science. The outcome is minimized loss for the financial organization through avoidance of bad debt.

Improvement in Health Care services
The health care industry deals with a variety of data, which can be classified into technical data, financial data, patient information, drug information and legal rules. All of this data needs to be analysed in a coordinated manner to produce insights that save cost for both the health care provider and the care receiver, while remaining legally compliant.

Computer Vision
Advances in image recognition by computers involve processing large sets of image data from multiple objects of the same category, for example in face recognition. These data sets are modelled, and algorithms are created to apply the model to new images with satisfactory results. Processing these huge data sets and building the models require the various tools of data science.

Efficient Management of Energy
As demand for energy soars, energy-producing companies need to manage the various phases of energy production and distribution more efficiently. This involves optimizing production methods and storage and distribution mechanisms, as well as studying customers' consumption patterns. Linking the data from all these sources and deriving insight can seem a daunting task; the tools of data science make it easier.

Python in Data Science
The programming requirements of data science demand a versatile, flexible language that is simple to write yet can handle highly complex mathematical processing. Python is well suited to these requirements, having already established itself as a language for both general and scientific computing. Moreover, it is continuously upgraded through new additions to its plethora of libraries aimed at different programming needs. Below we discuss the features of Python that make it the preferred language for data science.

It is a simple, easy-to-learn language that achieves results in fewer lines of code than similar languages like R. Its simplicity also makes it robust for handling complex scenarios with minimal code and much less confusion about the general flow of the program.
It is cross-platform, so the same code works in multiple environments without any change, which makes it ideal for multi-environment setups.
It executes faster than other similar languages used for data analysis like R and MATLAB.
Its excellent memory management, especially garbage collection, makes it versatile in gracefully handling very large volumes of data during transformation, slicing, dicing and visualization.
Most importantly, Python has a very large collection of libraries that serve as special-purpose analysis tools. For example, the NumPy package deals with scientific computing, and its arrays need much less memory than conventional Python lists for managing numeric data. The number of such packages is continuously growing.
Python has packages that can directly use code from other languages such as Java or C. This helps optimize performance by reusing existing code in other languages wherever it gives a better result.
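The memory claim about NumPy above can be checked directly. This sketch compares the storage used by a NumPy array of 1000 integers with a Python list of the same values; the list figure varies by platform, since each list element is a full Python object.

```python
import sys
import numpy as np

n = 1000
arr = np.arange(n, dtype=np.int64)   # 8 bytes per element, stored contiguously
lst = list(range(n))                 # each element is a separate int object

array_bytes = arr.nbytes
# list storage: the pointer table plus every int object it references
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

print(array_bytes)   # 8000
print(list_bytes)    # considerably larger; exact value is platform-dependent
```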
In the subsequent chapters, we will see how to leverage these features of Python to accomplish the tasks needed in the different areas of data science.

Wednesday, May 30, 2018

Understanding the concept of Wireless Access Point

An Access Point (AP) is the central node in 802.11 wireless implementations. It is the interface between the wired and wireless networks that all wireless clients associate with and exchange data through.
In a home environment, you most often have a router, a switch, and an AP embedded in one box, which makes it very practical for this purpose.

Base Transceiver Station
A Base Transceiver Station (BTS) is the equivalent of an Access Point in the 802.11 world, but it is used by mobile operators to provide signal coverage, e.g. 3G, GSM, etc.

Note − This tutorial concentrates on 802.11 wireless networking; additional detail on BTSs and mobile communication is not included.

Wireless Controller (WLC)
In corporate wireless implementations, the number of Access Points is often counted in hundreds or thousands of units. It would not be administratively feasible to manage all the APs and their configuration (channel assignments, optimal output power, roaming configuration, creation of SSIDs on every AP, etc.) separately.

This is where the concept of a wireless controller comes into play. It is the "mastermind" behind all wireless network operations. This centralized server has IP connectivity to all the APs on the network, making it easy to manage them globally from a single management platform, push configuration templates, monitor users across all APs in real time, and so on.

Service Set Identifier (SSID)
The SSID directly identifies the wireless network itself. In order to connect to a wireless LAN, the wireless client needs to send exactly the same SSID in the association frame as the SSID name preconfigured on the AP. So how do you find out which SSIDs are present in your environment? That is easy, as all operating systems come with a built-in wireless client that scans the wireless spectrum for networks to join (as shown below). You have probably done this process several times in your daily routine.

But how do those devices know that a specific wireless network has that particular name, just by listening to radio waves? It is because one of the fields in a beacon frame (which APs transmit continuously at very short intervals) contains the name of the SSID, always in clear text. That is the whole theory behind it.

An SSID can be up to 32 alphanumeric characters long and uniquely identifies a particular WLAN broadcast by the AP. When the AP has multiple SSIDs defined, it sends a separate beacon frame for each SSID.
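The "clear text" point can be illustrated by decoding the SSID field as it appears in a beacon frame: 802.11 carries it as a tagged information element with element ID 0, followed by a length byte and the raw name bytes. The snippet below is a hypothetical parser over a hand-built byte string, not captured traffic.

```python
def parse_ssid_element(data: bytes) -> str:
    """Parse an 802.11 SSID information element: ID (0), length, SSID bytes."""
    element_id, length = data[0], data[1]
    if element_id != 0:
        raise ValueError("not an SSID element")
    return data[2:2 + length].decode("utf-8", errors="replace")

# These bytes mimic the tagged SSID field of a beacon frame.
element = bytes([0, 7]) + b"HomeNet"
print(parse_ssid_element(element))   # HomeNet
```

Nothing in the element is encrypted or obfuscated, which is exactly why any listening client can display the network name.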

Cell
A cell is basically a geographical region covered by the AP's or BTS's antenna (transmitter). In the following image, a cell is marked with a yellow line.

Most often, an AP has much higher output power than the antenna built into the client device. The fact that a client can receive frames transmitted by the AP does not mean that two-way communication can be established; the picture above illustrates this. In both situations, a client can hear the AP's frames, but only in the second can two-way communication be established.
The takeaway from this short example is that, when designing wireless cell sizes, one has to take into account the average output transmitting power of the antennas that clients will use.

Channel
Wireless networks may be configured to support multiple 802.11 standards. Some operate on the 2.4 GHz band (for example 802.11b/g/n) and others on the 5 GHz band (for example 802.11a/n/ac).
Depending on the band, there is a predefined set of sub-bands defined for each channel. In environments with multiple APs placed in the same physical area, smart channel assignment is used to avoid collisions (collisions of frames transmitted on exactly the same frequency from multiple sources at the same time).
Let's look at a theoretical design of an 802.11b network with 3 adjacent cells, as shown in the picture above. The design on the left uses 3 non-overlapping channels, meaning that frames sent by the APs and their clients in a particular cell will not interfere with communication in the other cells. On the right we have the opposite situation: all frames fly around on the same channel, leading to collisions that degrade wireless performance significantly.
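The 2.4 GHz channel plan follows a simple arithmetic rule: channel n is centered at 2407 + 5·n MHz, while an 802.11b signal is about 22 MHz wide, so channels must be at least 5 numbers apart to avoid overlap. That is why the classic non-overlapping set is 1, 6 and 11. A small sketch:

```python
def channel_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz band channel (channels 1-13)."""
    return 2407 + 5 * channel

def non_overlapping(ch_a: int, ch_b: int, width_mhz: int = 22) -> bool:
    """True if the two channels' ~22 MHz wide signals do not overlap."""
    return abs(channel_center_mhz(ch_a) - channel_center_mhz(ch_b)) >= width_mhz

print(channel_center_mhz(1), channel_center_mhz(6), channel_center_mhz(11))
# 2412 2437 2462 -- each pair is 25 MHz apart, so all three coexist cleanly
print(non_overlapping(1, 6))   # True
print(non_overlapping(1, 3))   # False: only 10 MHz apart, signals collide
```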

Antennas
Antennas are used to "translate" information flowing as an electrical signal inside a cable into an electromagnetic field, which is used to transmit the frame over the wireless medium.
Every wireless device (either an AP or any type of wireless client) has an antenna connected to a transmitter and receiver module. It can be external and visible, or built in, as in most laptops and smartphones nowadays.
For wireless security testing or penetration tests of wireless networks, an external antenna (wireless adapter) is one of the most important tools; you should get one if you want to go into this field. One of the biggest advantages of external adapters, compared with most of the internal ones built into equipment, is that they can be put into so-called "monitor mode" - definitely something you need. It allows you to sniff wireless traffic from your PC using Wireshark or other well-known tools like Kismet.

There is a very good article on the internet (https://www.raymond.cc/blog/best-compatible-usb-wireless-adapter-for-backtrack-5-and-aircrack-ng/) that helps with the choice of an external wireless adapter with monitor-mode capabilities, especially for Kali Linux. If you are seriously considering going into this field, I really recommend purchasing one of the recommended ones (I have one of them myself).

Understanding Cloud Computing Planning

Before deploying applications to the cloud, it is necessary to consider your business requirements. The following are the issues one must consider:

Data Security and Privacy Requirement
Budget Requirements
Type of cloud - public, private or hybrid
Data backup requirements
Training requirements
Dashboard and reporting requirements
Client access requirements
Data export requirements
To meet all of these requirements, a well-defined plan is necessary. In this tutorial, we will discuss the planning phases that an enterprise should go through before migrating its business to the cloud. These planning phases are described in the following diagram:

Cloud Computing Planning
Strategy Phase
In this phase, we analyze the strategy problems that a customer might face. There are two steps in this analysis:

Cloud Computing Value Proposition
Cloud Computing Strategy Planning
Cloud Computing Value Proposition
In this step, we analyze the factors that influence customers when adopting the cloud computing model and target the key problems they wish to solve. These key factors are:

IT management simplification
Operation and maintenance cost reduction
Business model innovation
Low-cost outsourced hosting
High service quality outsourced hosting
All of the above analysis helps in decision making for future development.

Cloud Computing Strategy Planning
The strategy is established based on the results of the analysis in the previous step. Here, a strategy document is prepared according to the conditions a customer might face when adopting the cloud computing model.

Planning Phase
This phase analyses the problems and risks in the cloud application to ensure that cloud computing successfully meets the customer's business goals. It involves the following planning steps:

Business Architecture Development
IT Architecture Development
Requirements on Quality of Service Development
Transformation Plan Development
Business Architecture Development
In this step, we recognize the risks that the cloud computing application might cause from a business perspective.

IT Architecture Development
In this step, we identify the applications that support the business processes and the technologies required to support enterprise applications and data systems.

Requirements on Quality of Service Development
Quality of service refers to the non-functional requirements such as reliability, security, disaster recovery, etc. The success of applying cloud computing mode depends on these non-functional factors.

Transformation Plan Development
In this step, we formulate the various plans required to transform the current business to the cloud computing model.

Deployment Phase
This phase puts the output of the two phases above into practice. It involves the following two steps:

Selecting Cloud Computing Provider
Maintenance and Technical Service
Selecting Cloud Computing Provider
This step involves selecting a cloud provider on the basis of a Service Level Agreement (SLA), which defines the level of service the provider will deliver.

Maintenance and Technical Service
Maintenance and technical services are provided by the cloud provider, who needs to ensure the quality of service.

A Quick Guide To Cloud Computing Architecture and Infrastructure

Cloud computing architecture comprises many loosely coupled cloud components. We can broadly divide the cloud architecture into two parts:

Front End
Back End

The two ends are connected through a network, usually the Internet. The following diagram shows a graphical view of the cloud computing architecture:

Front End
The front end refers to the client part of the cloud computing system. It consists of the interfaces and applications required to access cloud computing platforms, for example a web browser.

Back End
The back end refers to the cloud itself. It consists of all the resources required to provide cloud computing services: huge data storage, virtual machines, security mechanisms, services, deployment models, servers, etc.

Note
It is the responsibility of the back end to provide built-in security mechanisms, traffic control and protocols.

The server employs certain protocols, known as middleware, that help the connected devices communicate with each other.

Cloud infrastructure consists of servers, storage devices, network, cloud management software, deployment software, and platform virtualization.

Cloud Computing Infrastructure Components
Hypervisor
A hypervisor is firmware or a low-level program that acts as a Virtual Machine Manager. It allows a single physical instance of cloud resources to be shared among several tenants.

Management Software
It helps to maintain and configure the infrastructure.

Deployment Software
It helps to deploy and integrate the application on the cloud.

Network
The network is the key component of cloud infrastructure. It connects cloud services over the Internet. It is also possible to deliver the network as a utility over the Internet, meaning the customer can customize the network route and protocol.

Server
The server handles resource sharing and offers other services such as resource allocation and de-allocation, resource monitoring, security, etc.

Storage
The cloud keeps multiple replicas of stored data. If one storage resource fails, the data can be retrieved from another replica, which makes cloud computing more reliable.

Infrastructural Constraints
Fundamental constraints that cloud infrastructure should implement are shown in the following diagram:

Cloud Computing Infrastructure Constraints
Transparency
Virtualization is the key to sharing resources in a cloud environment, but it is not possible to satisfy demand with a single resource or server. Therefore, there must be transparency in resources, load balancing and applications, so that we can scale them on demand.

Scalability
Scaling up an application delivery solution is not as easy as scaling up an application, because it involves configuration overhead or even re-architecting the network. The application delivery solution therefore needs to be scalable, which requires a virtual infrastructure in which resources can be provisioned and de-provisioned easily.

Intelligent Monitoring
To achieve transparency and scalability, application solution delivery will need to be capable of intelligent monitoring.

Security
The mega data center in the cloud should be securely architected. The control node, an entry point into the mega data center, also needs to be secure.