Sunday, March 22, 2015

Checking out GitHub pull requests locally

When working with GitHub you often need to check out a pull request (PR) locally so you can load it in your favorite tools and run/test it.

GitHub help suggests you can use a command similar to:

git fetch origin pull/ID/head && git checkout FETCH_HEAD
(here ID is the number of the pull request)

While this gives you the original code of the PR, it may differ from what you get once you actually merge the PR. The reason is that the target (master) branch may contain parallel changes that are not yet merged into the PR. You could do the merge locally too, but it turns out this is not necessary, as GitHub has already done it for you. All you need is this command instead:

git fetch origin pull/ID/merge && git checkout FETCH_HEAD
(notice the difference in the refspec 'head' vs. 'merge')

This will give you a merged version of the PR which contains all parallel commits in the target branch, even those merged after the PR was created.
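
If you do this often, both variants can be wrapped in a small shell function. This is only a minimal sketch, assuming the remote is called origin as above; the names pr_refspec and pr_checkout are made up for this example.

```shell
# Build the ref name GitHub exposes for a pull request.
# $1 = PR number, $2 = "head" (original PR code) or "merge" (merged version, the default)
pr_refspec() {
    printf 'pull/%s/%s\n' "$1" "${2:-merge}"
}

# Fetch the chosen PR ref from origin and check it out (detached HEAD).
pr_checkout() {
    git fetch origin "$(pr_refspec "$1" "$2")" && git checkout FETCH_HEAD
}
```

With this in your shell profile, pr_checkout ID gives you the merged version and pr_checkout ID head the original PR code.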

Sunday, January 11, 2015

Rip audio CDs on Linux

I still use audio CDs sometimes. But they tend to get lost or damaged easily. So it is good practice to convert them to MP3.
So far I have used Asunder for CD ripping. It is very easy to use ... when it works. But with some discs, usually lower quality CD-Rs, it just hangs right from the start. So I searched for another tool to rip audio CDs on Linux.
It turned out you can do this very quickly with two command line tools - cdparanoia & lame. As usual you can install them on Ubuntu with a single command:

$ sudo apt-get install cdparanoia lame

Assuming the CD is loaded in the CD drive, running this simple command will copy all audio tracks to WAV files in the current directory:

$ cdparanoia -B

Next, to convert those WAV files to MP3, run this command:

$ ls -1 | xargs -L 1 lame --preset standard

This will compress the audio files about 10 times using VBR ~190kbps.
If you are satisfied with the result, you can delete all WAV files:

$ rm *.wav

This will leave only MP3 files named like track04.cdda.mp3.
Of course these tools have many more options, so you can tweak them as much as you like. For example, the lame option --ta sets the artist and --tl the album in the ID3 tags inside the MP3 files.
You can also script and automate this process as you see fit, but these are the tools that do the job nicely and quickly.
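
As a starting point for such a script, here is a minimal sketch. It encodes only the *.wav files, so other files in the directory are not passed to lame, and it tags each MP3 with the --ta and --tl options mentioned above. The function names and the artist/album arguments are hypothetical.

```shell
# Name of the MP3 for a given WAV file: track04.cdda.wav -> track04.cdda.mp3
mp3_name() {
    printf '%s\n' "${1%.wav}.mp3"
}

# Encode every WAV file in the current directory.
# $1 = artist, $2 = album (both are written to the ID3 tags)
encode_all() {
    for wav in *.wav; do
        [ -e "$wav" ] || continue   # no WAV files: the glob stays unexpanded
        lame --preset standard --ta "$1" --tl "$2" "$wav" "$(mp3_name "$wav")"
    done
}
```

After cdparanoia -B has finished, you would call it as encode_all "Some Artist" "Some Album".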

BTW did you know that "disk" refers to magnetic storage while "disc" refers to optical storage? See Wikipedia.

Wednesday, September 3, 2014

npm - first encounters

Playing with node.js at work, I hit an issue at the very beginning: I was unable to install any package using npm (the convenient package manager of node).

$ npm install nodemon
npm ERR! network tunneling socket could not be established, cause=connect EINVAL
npm ERR! network This is most likely not a problem with npm itself
npm ERR! network and is related to network connectivity.
npm ERR! network In most cases you are behind a proxy or have bad network settings.
npm ERR! network 
npm ERR! network If you are behind a proxy, please make sure that the
npm ERR! network 'proxy' config is set properly.  See: 'npm help config'

OK, I really am behind a proxy, so I ran npm config edit and uncommented these lines:

; proxy=proxy:8080
; https-proxy=proxy:8080

Same result:

$ npm install nodemon
npm ERR! network tunneling socket could not be established, 

The error EINVAL suggests that connect was called with an invalid argument. What might that be? Let's see the system calls of npm:

$ strace npm install nodemon 1> npm.strace 2>&1
$ grep EINVAL npm.strace
ioctl(9, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7fff5f9a8330) = -1 EINVAL (Invalid argument)
connect(10, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("0.0.31.144")}, 16) = -1 EINVAL (Invalid argument)
connect(10, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("0.0.31.144")}, 16) = -1 EINVAL (Invalid argument)
connect(10, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("0.0.31.144")}, 16) = -1 EINVAL (Invalid argument)
write(2, " tunneling socket could not be e"..., 65 tunneling socket could not be established, cause=connect EINVAL

Ahaa, we see 3 calls to connect with IP 0.0.31.144 and port 443 (default HTTPS port) and all of these returned EINVAL (Invalid argument).
What is this strange IP? Googling it turned up this post, according to which the environment variable http_proxy should be given with a protocol, e.g.

http_proxy=http://proxy:8080

So setting https-proxy=http://proxy:8080 in npm config did solve the problem!
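
As for the strange IP: 0.0.31.144 is simply the number 8080 written as a 32-bit IPv4 address (31 * 256 + 144 = 8080). So it looks like the port from proxy:8080 ended up where the host address should have been, although I have not checked the npm code to confirm that. Here is a small sketch to verify the arithmetic; int_to_ip is a name made up for this example.

```shell
# Print a 32-bit integer as a dotted-quad IPv4 address.
int_to_ip() {
    printf '%d.%d.%d.%d\n' \
        $(( ($1 >> 24) & 255 )) $(( ($1 >> 16) & 255 )) \
        $(( ($1 >> 8) & 255 ))  $(( $1 & 255 ))
}
```

Running int_to_ip 8080 prints 0.0.31.144, exactly the address seen in the strace output.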

On my Linux machine both of these environment variables are set:

https_proxy=http://proxy:8080
HTTPS_PROXY=proxy:8080

and it seems npm uses the upper case variable for the default values in its config.
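
To avoid this trap, the proxy value can be normalized before it reaches npm. This is a minimal sketch, assuming a plain HTTP proxy. with_scheme and set_npm_proxy are hypothetical helpers, while npm config set is the regular npm command.

```shell
# Prepend http:// unless the value already starts with a scheme.
with_scheme() {
    case "$1" in
        *://*) printf '%s\n' "$1" ;;
        *)     printf 'http://%s\n' "$1" ;;
    esac
}

# Store the normalized value in the npm config for both proxy settings.
set_npm_proxy() {
    npm config set proxy "$(with_scheme "$1")" &&
    npm config set https-proxy "$(with_scheme "$1")"
}
```

For example, set_npm_proxy "$HTTPS_PROXY" would store http://proxy:8080 even when the variable holds only proxy:8080.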

Wednesday, May 28, 2014

Dynamic auto-complete in Android

Auto-complete text input is very common in modern UIs. It makes it easy to select an item from a large list, or just provides a hint about matching items. Often the whole list of items is not available locally, so you need to look up matching items in some external source.
In this example we use auto-complete to select a stock, similar to the search field on finance.yahoo.com.

Here we use a web API from Yahoo to look up matching stocks, but you can use the same approach with any way of populating the auto-complete drop-down dynamically. For example, you could search in a database.

These are the major objects involved:
AutoCompleteTextView -> Adapter -> Filter



android.widget.AutoCompleteTextView is the standard Android widget for this purpose. We will use it as is, but will implement a custom Adapter and Filter.

 symbolText = new AutoCompleteTextView(getActivity());
 symbolText.setAdapter(new StockLookupAdapter(getActivity()));

Here is our custom Adapter:

public class StockLookupAdapter extends
        ArrayAdapter<StockLookupAdapter.StockInfo> {

    private static final String LOG_TAG = StockLookupAdapter.class
            .getSimpleName();

    class StockInfo {
        public String symbol;
        public String name;
        public String exchange;

        @Override
        public String toString() {
            // text to display in the auto-complete dropdown
            return symbol + " (" + name + ")";
        }
    }

    private final StockLookupFilter filter = new StockLookupFilter();

    public StockLookupAdapter(Context context) {
        super(context, android.R.layout.simple_list_item_1);
    }

    @Override
    public Filter getFilter() {
        return filter;
    }

    private class StockLookupFilter extends Filter {
        ...
    }
}

Here StockInfo carries the data for each item in the drop-down: the properties of each stock, such as its symbol (a.k.a. ticker) and name. We override getFilter to return the custom Filter - StockLookupFilter. This is the essential part.
Here is what the android.widget.Filter documentation says:

Filtering operations performed by calling filter(CharSequence) or filter(CharSequence, android.widget.Filter.FilterListener) are performed asynchronously. When these methods are called, a filtering request is posted in a request queue and processed later. Any call to one of these methods will cancel any previous non-executed filtering request.

This is exactly what we need as calling a web API usually takes some time so we should not do it in the UI thread. Also the user may type faster than the web API can return the results. This could result in the hints shown in the drop-down lagging considerably behind the current text state. The queuing described above helps avoid this effect.

So here is our custom filter (nested inside StockLookupAdapter):

private class StockLookupFilter extends Filter {

    // Invoked in a worker thread to filter the data according to the
    // constraint.
    @Override
    protected FilterResults performFiltering(CharSequence constraint) {
        FilterResults results = new FilterResults();
        if (constraint != null) {
            ArrayList<StockInfo> list = lookupStock(constraint);
            results.values = list;
            results.count = list.size();
        }
        return results;
    }

    private ArrayList<StockInfo> lookupStock(CharSequence constraint) {
        ...
    }

    // Invoked in the UI thread to publish the filtering results in the user
    // interface.
    @Override
    protected void publishResults(CharSequence constraint,
            FilterResults results) {
        setNotifyOnChange(false);
        clear();
        if (results.count > 0) {
            addAll((ArrayList<StockInfo>) results.values);
            notifyDataSetChanged();
        } else {
            notifyDataSetInvalidated();
        }

    }

    @Override
    public CharSequence convertResultToString(Object resultValue) {
        if (resultValue instanceof StockInfo) {
            // text to set in the text view when an item from the dropdown
            // is selected
            return ((StockInfo) resultValue).symbol;
        }
        return null;
    }

}

performFiltering is executed in a background thread and it finds the items to be shown in the drop-down based on the current text in the text field.
publishResults is executed on the UI thread and it is given the FilterResults returned by performFiltering. Here we just reset the ArrayAdapter contents and notify the UI to update.
convertResultToString returns the string to be substituted in the text field when a given item from the drop-down is selected. In our case we display both stock symbol and name in the drop-down but want only the symbol in the text field.

So, as we can see, simple text navigation can be very efficient. Probably this is the reason why it is so popular these days.

P.S.
Still there is one glitch that irritates me. It seems part of the drop-down is covered by the on-screen keyboard. If I try to close the keyboard, the drop-down is closed first.

Friday, April 4, 2014

Does destruction change anything?

class C{};

// consider this

void foo(const C* p)
{
    delete p;
}

// does it work? should it work?
// after all destroying the object very much changes it
// and you are not allowed to change a const object, right?
// ...
// now consider this
// (you can substitute here auto_ptr 
// with your favorite smart pointer)

void foo(auto_ptr<const C> p) { }

// is this possible at all?
// ...
// how about this

void foo(const C x) { }

// hmm... this is pretty common code
// if const objects exist (and we know they do) 
// then they must come to an end somehow

// so it is possible and completely normal to destroy 
// a constant object and all of the above is valid code

// so now my interpretation of const is this:
// if the object still exists, 
// its observable state should be the same as before

Based on this post on SO

Saturday, December 21, 2013

Back to native

Usually the end of the year is time to take a step back, reflect on the past and plan for the future. For me it is also a time for a change. After 9 years working in Java (SE) I will switch back to C++. So I'll use the opportunity to share some thoughts I have accumulated in recent years.
Generally I avoid being a language zealot (like some of my colleagues ;). For me it is more important to create simple and beautiful solutions leveraging whatever technology is available.

Some say Java tends to be too ceremonial and I would agree on that.
Nowadays the industry has accepted that checked exceptions are not so useful after all.
Proper resource management with all this exception handling feels very clumsy and is definitely inferior compared to the good old RAII mechanism available in C++.
Some design mistakes from the early days of Java are still present for the sake of compatibility. These are things like decorator pattern overuse in stream handling and all these unnecessary methods in Object, most of which you rarely use but constantly pay the price for.

It was interesting for me to see how C++ has advanced over the last decade. I see there are many new and useful features in C++11. For example, lambdas are now part of the standard, while in Java they have been delayed for years and are still not present.
The preprocessor and the necessity to use it has always been for me one of the major issues in C++. So I was hoping that this was addressed somehow in the recent evolution of the language. Alas! All these #include and #define directives and the involved text substitution without any awareness of the language feel like a stone age hack. And it is still there.
Even before I started working in Java I thought it would be great if C++ could be compiled to some standard intermediate format. This could improve both compile-time and runtime performance. When you compile to machine instructions ahead-of-time (AOT) you rarely know the exact CPU where the code will run, so you cannot take advantage of new CPU features. I was glad to find recently that there is already a considerable effort in that direction, namely the LLVM project, even though it is not C/C++ specific.

Wednesday, May 22, 2013

Java Serialization Performance

The Java program I work on in my day job maintains its data model in a tree-like structure of Java beans. The data volume is not big but still it is significant - about 500K property values in about 30K bean objects.
To ensure our program can restart from the point of failure, we often have to save this data model to disk. We use custom XML serialization. As this takes quite some time (several seconds), I started researching different alternatives.

It turned out that JAXB is not appropriate for the job as it does not cope well with cyclic dependencies and cross-references. It also requires adding a lot of annotations and our data model employs about 50 classes.

Next I tried standard Java object serialization but it turned out to perform even worse - it saved the data model in 15s in a 10MB file.

A quick search about Java serialization turned up an open source library, Kryo. I was able to test it very easily as it requires no annotations and almost no code changes. Kryo turned out to be lightning fast! It saved the same data model in just 400ms in a 5MB file. At first I did not believe all our data was saved, so I loaded it back and compared it to the original data, and there were no differences.
One of the main reasons for this performance is that instead of reflection Kryo uses dynamic byte code generation via ReflectASM.
I also found Spine, another project run by Nate (the author of Kryo), to be very interesting. It is related to game development. Obviously a very skilled developer.
Kudos Nate, great job!