Fun with librados (Python)

I have recently started testing the Python bindings for librados and I’d like to share what I have learned so far.

To begin, you will need a Ceph cluster. An easy way to get one going is to use ansible-ceph. I will not go into much detail about ansible-ceph in this post, but it is very straightforward to get running: just follow the directions on the GitHub README page. For the code in this post, there’s no need to add RGW or MDS nodes. I changed the default box from Ubuntu to CentOS 7.1. Once you SSH into node mon0, the rados Python module is already installed, so you should be able to run the code examples from there.


OpenStack Swift hackathon report

The OpenStack Swift community held its mid-cycle meeting last week. It was hosted by SwiftStack in their new San Francisco HQ and attended by 16 contributors representing 8 different companies: Red Hat[3], Intel[1], SwiftStack[3], HP[2], IBM[1], NTT[3], Fujitsu[1] and Rackspace[2].

We started the week by listing what looked like an almost impossibly large number of topics to discuss. The main focus of the hackathon was the ongoing Erasure Coding work targeted for the Kilo release, but we created a list with over 20 topics to be discussed and worked on in 4 days.

I’m glad to say we were able to cover every one of the topics we listed, resulting in a very productive week. The output of those discussions took the form of patches being merged after review, new patches being submitted for review, or specs being written to record design ideas and decisions for upcoming features. Projects have been proposed or taken on by the different members of the community and the companies they represent. Basically, everyone left with many action items for the projects they are working on.

To accomplish our goal of covering all those topics, we had to break up into smaller groups and do some of the work in parallel. This enabled contributors to participate in the discussions of projects they were already working on or new features they were interested in.

Below is a quick summary/status of the discussions I participated in:

Erasure Codes:

EC is targeted for the Kilo release. During this week we reviewed code for the PUT and GET requests as well as the reconstructor. Testing has been limited so far, as the GET patches were only posted this week. Up until now it was possible to write data, but not retrieve it. With the majority of the GET code base now up for review, the amount of testing being done should pick up quickly. This work is being led by Paul Luse at Intel and SwiftStack.
Specs and Patches:

Encryption:

Data-at-rest encryption is still in the design phase. There is a spec up for review, and a lot of new discussion happened all week. HP and IBM were leading the discussions on this feature.
Specs:

Container Sharding:

One of Swift’s weak points is that container performance degrades as the size of a container grows. A spec has been proposed on how to solve this issue with container sharding. The first version of the spec listed 3 alternatives, and during the hackathon one of them was chosen as the most suitable. The work on this should continue with a new version of the spec being put up for review, followed by some code patches. This work is being led by Matthew Oliver from Rackspace.
Spec:

Etags on large objects:

Currently Swift calculates the etag header of large objects (SLO) differently from “regular” objects. While the etag of a “regular” object is the md5 checksum of its content, the etag of a large object is the md5 of the concatenation of the etags of each one of its segments. This is only useful if the client application has that information available and can perform the same calculation. NTT has been looking for ways to calculate the etag of a large object based on the content of the data, just like “regular” objects. This has proven to be a difficult problem to solve and more investigation will be needed.
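
To make the difference concrete, here is a minimal sketch of the two calculations in plain Java. It assumes the segment etags are concatenated as hex strings before being hashed; the class and method names are hypothetical and are not part of Swift.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

class SloEtagSketch {

    // Hex-encoded MD5 of a byte array, the form an etag normally takes.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // "Regular" object: the etag is the MD5 of the content itself.
    static String regularEtag(byte[] content) {
        return md5Hex(content);
    }

    // Large object: MD5 over the concatenated segment etags
    // (assumption: the etags are joined as hex strings).
    static String largeObjectEtag(List<byte[]> segments) {
        StringBuilder joined = new StringBuilder();
        for (byte[] segment : segments) {
            joined.append(md5Hex(segment));
        }
        return md5Hex(joined.toString().getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        byte[] whole = "hello world".getBytes(StandardCharsets.UTF_8);
        List<byte[]> segments = Arrays.asList(
            "hello ".getBytes(StandardCharsets.UTF_8),
            "world".getBytes(StandardCharsets.UTF_8));

        // Same data, two different etags.
        System.out.println("regular: " + regularEtag(whole));
        System.out.println("large:   " + largeObjectEtag(segments));
    }
}
```

A client that only has the complete file, and does not know where the segment boundaries are, cannot reproduce the second value, which is the limitation described above.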

Storlets (IBM):

During the Paris Summit, IBM presented a couple of talks on a new technology they have been researching called Storlets. This technology would allow the use of Swift storage nodes as compute nodes. Typically data processing is done on compute nodes separate from where data is actually stored. The problem with this approach is that large amounts of data need to be transferred, which becomes inefficient. Storlets would move the processing of the data to storage systems instead of the other way around.

The good news about this effort is that IBM is looking to open source this technology. During the hackathon we held a conference call with IBM to hear the status of the project and discuss ideas on where this new code should live and how it can be integrated with Swift.
Blog on Storlets:

Change partition power:

Another weak point of Swift is that deployers must choose their partition power when first deploying their cluster and can never change it, which makes things very difficult for operators who want to start with a small cluster and grow over time. A spec has been proposed to start new work on solving this problem. An overview of the proposed solution was given and implementation ideas were discussed. Alistair Coles from HP proposed the spec with references to previous work done in this area by Christian Schwede (Red Hat).
Spec:
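
To illustrate why the partition power is hard to change, here is a minimal sketch in plain Java. It assumes that a partition number is taken from the top bits of an MD5 hash of the object path, which is my understanding of how the Swift ring works; the salting that a real cluster applies to the path is left out, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class PartitionPowerSketch {

    // Map an object path to a partition: take the first 32 bits of the
    // MD5 digest and keep only the top `partPower` bits (1..32).
    static long partitionFor(String path, int partPower) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(path.getBytes(StandardCharsets.UTF_8));
            long top32 = ((digest[0] & 0xFFL) << 24)
                       | ((digest[1] & 0xFFL) << 16)
                       | ((digest[2] & 0xFFL) << 8)
                       |  (digest[3] & 0xFFL);
            return top32 >>> (32 - partPower);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        String path = "/account/container/object"; // hypothetical path
        System.out.println("power 10: " + partitionFor(path, 10));
        System.out.println("power 11: " + partitionFor(path, 11));
    }
}
```

Under this assumption, raising the power by one splits partition p into partitions 2p and 2p+1, so most existing data ends up under a differently numbered partition and would have to be relocated.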

Fast-Post:

Alistair Coles (HP) has a design spec and a couple of patches up for review to solve the issue of changing some specific container metadata on POST. The biggest challenge with this work is making sure that metadata is correctly replicated and reconciled on an eventually consistent system such as Swift.
Specs and patches:

Change policy on container:

NTT is also investigating ways of allowing Swift admins to change the storage policy on a container. For example, a container is created with a 3-replica policy and after some time the admin would like to change it to a 2-replica policy. NTT proposed some ideas on how the data could be migrated without the need to move the data to a new container. There is no spec for this yet.

Object undelete:

Christian Schwede (Red Hat) led the discussion on some ideas for a new feature where operators would be able to configure their clusters to hold on to data for a configurable period of time after a DELETE request. This would help prevent unwanted data deletion caused by either mistakes or abuse of the system.
Spec:

fsync on dirs:

Prashanth Pai (Red Hat) has a patch up for review regarding fsyncing directories after a rename. This would close a small window of vulnerability to data loss, but at the cost of some performance. Ideas were discussed on how to best address the issue.
Patch:
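
The general technique can be sketched in plain Java as follows. This is not the Swift patch itself (Swift is written in Python); it is only an illustration of the write, fsync, rename, fsync-directory sequence, and the class and method names are hypothetical. Opening a directory with FileChannel is assumed to work on Linux; the sketch falls back silently where it does not.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

class DurableRenameSketch {

    // Write data to a temp file, fsync it, rename it into place, then
    // fsync the parent directory so the rename itself reaches the disk.
    static void writeDurably(Path target, byte[] data) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "tmp-", ".part");
        try (FileChannel file = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            ByteBuffer buffer = ByteBuffer.wrap(data);
            while (buffer.hasRemaining()) {
                file.write(buffer);
            }
            file.force(true); // fsync the file contents
        }
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        try (FileChannel directory = FileChannel.open(dir, StandardOpenOption.READ)) {
            directory.force(true); // fsync the directory entry (the extra cost)
        } catch (IOException e) {
            // Assumption: some platforms refuse to open or sync a directory;
            // the sketch then keeps the weaker guarantee.
        }
    }

    // Helper for trying the sketch out in a scratch directory.
    static String roundTrip(String text) {
        try {
            Path dir = Files.createTempDirectory("durable-rename");
            Path target = dir.resolve("object.data");
            writeDurably(target, text.getBytes(StandardCharsets.UTF_8));
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Without the second force call, a crash shortly after the rename could leave the directory still pointing at the old name, which is the small window mentioned above. The extra sync on every write is where the performance cost comes from.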

Object Versioning middleware:

We had a live code review of this patch that I have been working on. Currently object versioning is embedded in the PUT method of the proxy and this patch moves that code to a new middleware. Some needed changes were identified and a new patch will be put up for review soon.
Patch:

Single Process:

I have been leading this project, which will allow for better performance when deploying storage policies with third-party backends. We had a walkthrough of the current code that is up for review for this project. The idea has been well received by the community, and core developers have shown willingness to review and work together with us. During the week we identified common places where a refactoring of the Object controller will benefit both this project and the EC project.
Patch:

Container aliases:

The basic idea is to set an alias to a container, for example with a target container inside another account. This would make it much easier to access that container, because the user doesn’t need to use a different storage URL for access. A while ago Christian Schwede (Red Hat) wrote a middleware and proposed this (https://review.openstack.org/#/c/62494), but due to some edge cases we haven’t been able to merge it yet. The biggest problem is the eventual consistency when setting the alias. The discussion came up again, so a spec has been submitted for review. Hopefully we will be able to reach a consensus on this and on how to proceed.
Spec:

Community:

There were some discussions on how to better provide status of on-going projects to the community at large. We also had some talks on the spec review process and bug list grooming.

There were also talks and code reviews on a number of other projects: Service Tokens, ring optimizations, patches on replicator and updater, Container tempurl keys, FQDN in rings, and Keystone policy (ACLs) files.

John Dickinson (Swift PTL) has also written a summary of this event here.

P.S.: It’s important to note that these projects do not represent all the work that is happening in the Swift community; rather, they are just a subset of the projects that the people attending the hackathon are working on.

Listing Vimeo data in an Android application

Last summer I was taking a mobile development course and had the chance to learn a little about Android development. A few months have passed since I finished the class, but I will try to jog my memory and comment a bit on the code I wrote before I forget everything.

For my final project, I tried to create a small app for my church, and in one of the Activities I showed a list of our videos that are posted on our Vimeo account. It is an Activity with a simple list, which let me focus on the more interesting work of downloading the data and parsing the list from Vimeo.

Vimeo has made available a simple API for developers to get information about videos, users, groups, etc. The API supports multiple response formats. I used the JSON format, and in my Activity I used the JSONArray and JSONObject classes to parse the response. The API is well documented on Vimeo’s website, but it basically looks like this:

http://vimeo.com/api/v2/username/request.output

username – id of the user
request – the data you want
output – output format (JSON, PHP, XML)

The Activity class contained a private class that extended AsyncTask.

private class LongRunningGetIO extends AsyncTask <Void, Void, String> {

Then two methods were overridden: doInBackground and onPostExecute. In doInBackground, the request is sent to get the data.

@Override
protected String doInBackground(Void... params) {

  HttpClient httpClient = new DefaultHttpClient();
  HttpContext localContext = new BasicHttpContext();
  HttpGet httpGet = 
    new HttpGet("http://vimeo.com/api/v2/<your_username>/videos.json");
  String text = null;

  try {
    HttpResponse response = httpClient.execute(httpGet, localContext);
    HttpEntity entity = response.getEntity();

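    // TODO: change and test to use: EntityUtils.toString(entity);
    text = getASCIIContentFromEntity(entity);
  } catch (Exception e) {
    // Return null so onPostExecute does not try to parse an error
    // message as JSON (requires android.util.Log).
    Log.e("LongRunningGetIO", "Request to Vimeo failed", e);
    return null;
  }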

  return text;
}

onPostExecute parses the response and adds the result to an ArrayList. Using a SimpleAdapter, the titles and dates of the videos are displayed in a ListView widget. See the code below:

@Override
protected void onPostExecute(String results) {
  if (results!=null) {
    ArrayList<HashMap<String, String>> videoList =
      new ArrayList<HashMap<String, String>>();

    // parse json response and add to a list
    try {
      JSONArray jsonVideoList = new JSONArray(results);
      for(int i = 0; i < jsonVideoList.length(); i++) {
        JSONObject video = jsonVideoList.getJSONObject(i);
        HashMap<String, String> map = new HashMap<String, String>();

        // Storing each json item
        map.put("title", video.getString("title"));
        map.put("mobile_url", video.getString("mobile_url"));

        // keep both the raw and the formatted upload date,
        // just in case the raw one is needed in the future
        String json_upload_date = video.getString("upload_date");
        Date date =
          new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US).
          parse(json_upload_date);
        String fDate = new SimpleDateFormat("MMM d, yyyy", Locale.US).
          format(date);
        map.put("upload_date", json_upload_date);
        map.put("upload_date_formatted", fDate);
        videoList.add(map);
      }

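    } catch (Exception e) {
      // Malformed JSON or an unparsable date: log it and show
      // whatever was parsed so far (requires android.util.Log).
      Log.e("SermonActivity", "Failed to parse video list", e);
    }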

    // add video list to ListView widget
    String[] from = {"title", "upload_date_formatted"};
    int[] to = { android.R.id.text1, android.R.id.text2 };
    ListAdapter adapter = new SimpleAdapter(SermonActivity.this,
      videoList, android.R.layout.simple_list_item_2,
      from, to);
    ListView lv= (ListView)findViewById(R.id.sermonListView);
    lv.setAdapter(adapter);

    lv.setOnItemClickListener(new AdapterView.OnItemClickListener() {

      @Override
      public void onItemClick(AdapterView<?> parent, final View view,
        int position, long id) {

        @SuppressWarnings("unchecked")
        HashMap<String, String> map = (HashMap<String, String>)
        parent.getItemAtPosition(position);

        // launch video
        startActivity(new Intent(Intent.ACTION_VIEW,
          Uri.parse(map.get("mobile_url"))));
      }
    });
  }
}

Listing Vimeo data in an Android application

Last summer I took a mobile development class and got the chance to learn a little bit about Android development. It has been a couple of months since I finished the class, but I will attempt to jog my memory and comment a little bit on the code I wrote there before I completely forget everything.

For my final project, I tried to create a little app for my church, and in one of the Activities I displayed a list of our videos that are posted on Vimeo. It’s a very simple Activity with just a list, but it let me focus on the more interesting work of downloading and parsing the list from Vimeo.

Vimeo has made available a simple API for developers to get information about public videos, users, groups, etc. The API supports multiple response formats. I used the JSON format, and in my Activity I used the JSONArray and JSONObject classes to parse the response. The API request is well documented on Vimeo’s website, but it basically looks like this:

http://vimeo.com/api/v2/username/request.output

username – id of the user
request – the data you want
output – output type (JSON, PHP, XML)
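
As a small illustration of the URL pattern above, the three parts can be assembled like this. The helper class and method names are hypothetical, "someuser" is a placeholder, and lowercasing the output type is an assumption based on the ".json" suffix used in the request later in this post.

```java
import java.util.Locale;

class VimeoUrlSketch {

    // Fill in the username/request.output pattern of the simple API.
    static String buildUrl(String username, String request, String output) {
        return String.format(Locale.US, "http://vimeo.com/api/v2/%s/%s.%s",
            username, request, output.toLowerCase(Locale.US));
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("someuser", "videos", "JSON"));
    }
}
```

Calling buildUrl("someuser", "videos", "JSON") gives http://vimeo.com/api/v2/someuser/videos.json.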

The Activity class contained a private class that extended AsyncTask.

private class LongRunningGetIO extends AsyncTask <Void, Void, String> {

Then two methods were overridden: doInBackground and onPostExecute. In doInBackground, the request is sent to get the data.

@Override
protected String doInBackground(Void... params) {

  HttpClient httpClient = new DefaultHttpClient();
  HttpContext localContext = new BasicHttpContext();
  HttpGet httpGet = 
    new HttpGet("http://vimeo.com/api/v2/<your_username>/videos.json");
  String text = null;

  try {
    HttpResponse response = httpClient.execute(httpGet, localContext);
    HttpEntity entity = response.getEntity();

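    // TODO: change and test to use: EntityUtils.toString(entity);
    text = getASCIIContentFromEntity(entity);
  } catch (Exception e) {
    // Return null so onPostExecute does not try to parse an error
    // message as JSON (requires android.util.Log).
    Log.e("LongRunningGetIO", "Request to Vimeo failed", e);
    return null;
  }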

  return text;
}
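
The Apache HttpClient classes used above were later removed from the Android SDK (as of Android 6.0), so here is a rough sketch of the same request using HttpURLConnection instead. It is not the code from my app: the class and method names are hypothetical, and the timeouts are arbitrary values picked for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class HttpFetchSketch {

    // Read an entire stream into a UTF-8 string.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // GET the given URL; returns the body, or null on a non-200 response.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) new URL(url).openConnection();
        try {
            conn.setConnectTimeout(10000); // milliseconds, arbitrary
            conn.setReadTimeout(10000);
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return null;
            }
            try (InputStream in = conn.getInputStream()) {
                return readAll(in);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

Inside doInBackground, a call such as fetch("http://vimeo.com/api/v2/<your_username>/videos.json") wrapped in a try/catch would take the place of the HttpClient code. Like any network call on Android, it still has to run off the main thread.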

onPostExecute parses the response and adds the result to an ArrayList. Using a SimpleAdapter, the titles and dates of the videos are displayed in a ListView widget. See the code below:

@Override
protected void onPostExecute(String results) {
  if (results!=null) {
    ArrayList<HashMap<String, String>> videoList =
      new ArrayList<HashMap<String, String>>();

    // parse json response and add to a list
    try {
      JSONArray jsonVideoList = new JSONArray(results);
      for(int i = 0; i < jsonVideoList.length(); i++) {
        JSONObject video = jsonVideoList.getJSONObject(i);
        HashMap<String, String> map = new HashMap<String, String>();

        // Storing each json item
        map.put("title", video.getString("title"));
        map.put("mobile_url", video.getString("mobile_url"));

        // keep both the raw and the formatted upload date,
        // just in case the raw one is needed in the future
        String json_upload_date = video.getString("upload_date");
        Date date = 
          new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US).
          parse(json_upload_date);
        String fDate = new SimpleDateFormat("MMM d, yyyy", Locale.US).
          format(date);
        map.put("upload_date", json_upload_date);
        map.put("upload_date_formatted", fDate);
        videoList.add(map);
      }

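    } catch (Exception e) {
      // Malformed JSON or an unparsable date: log it and show
      // whatever was parsed so far (requires android.util.Log).
      Log.e("SermonActivity", "Failed to parse video list", e);
    }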

    // add video list to ListView widget
    String[] from = {"title", "upload_date_formatted"};
    int[] to = { android.R.id.text1, android.R.id.text2 };
    ListAdapter adapter = new SimpleAdapter(SermonActivity.this,
      videoList, android.R.layout.simple_list_item_2,
      from, to);
    ListView lv= (ListView)findViewById(R.id.sermonListView);
    lv.setAdapter(adapter);

    lv.setOnItemClickListener(new AdapterView.OnItemClickListener() {

      @Override
      public void onItemClick(AdapterView<?> parent, final View view,
        int position, long id) {

        @SuppressWarnings("unchecked")
        HashMap<String, String> map = (HashMap<String, String>)
        parent.getItemAtPosition(position);

        // launch video
        startActivity(new Intent(Intent.ACTION_VIEW,
          Uri.parse(map.get("mobile_url"))));
      }
    });
  }
}
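
The date handling in onPostExecute can be tried on its own, outside of Android. The sketch below pulls it into a hypothetical helper that returns null for a date it cannot parse, so that one bad entry would not have to abort the whole list; that fallback is my assumption and not what the code above does.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class UploadDateSketch {

    // Convert "yyyy-MM-dd HH:mm:ss" (the upload_date format parsed above)
    // into the "MMM d, yyyy" form shown in the list.
    static String formatUploadDate(String raw) {
        try {
            Date date = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US)
                .parse(raw);
            return new SimpleDateFormat("MMM d, yyyy", Locale.US).format(date);
        } catch (ParseException e) {
            return null; // caller decides how to display an unknown date
        }
    }

    public static void main(String[] args) {
        System.out.println(formatUploadDate("2013-08-15 10:30:00"));
    }
}
```

With the sample input in main, this prints Aug 15, 2013.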

Configuring keyboard in Fedora 19

When configuring a new OS installation, one of the first things I need to do is set up the keyboard to type accented characters, so I can type in Portuguese correctly. To do this in Fedora 19 with a US keyboard layout (the exact choices will differ for non-US keyboards, but the idea is basically the same), follow these steps:

  1. Activities > Settings > Region and Language
  2. Click the Plus icon under Input Sources
  3. Select English (United States)
  4. Select English (Mali, US international)

Note: Most people in Brazil are used to typing ' + c to get ç. In Fedora 19, you must instead hold the right Alt key and press comma (,). All other accented characters remain the same.