Start learning Rust

I don’t really know why, but I want to do something different from what I usually do, and I want a familiar programming language that does not have garbage collection (GC). There are two candidates for me: Swift and Rust.

I think Swift is more popular and will keep getting more popular, but I started to learn Rust because its memory management mechanism interests me. The official web site says the following:

  • guaranteed memory safety
  • threads without data races

Even when we use a language with GC such as Java, we still need to care about memory management, especially in the operation phase. I think it’s more difficult to write an application in Rust than in other modern programming languages that have GC, but once it’s written, an application in Rust is easier to operate.

I think I will feel like I’m using a sharper knife once I get used to Rust.

Cast a List type object without iteration

There are three choices as follows:

        List<String> list = Arrays.asList("1","2");

        // Choice 1: raw-type cast (unchecked, no copy)
        List<Object> objectList = (List) list;

        // Choice 2: cast via a wildcard (unchecked, no copy)
        List<Object> objectList2 = (List<Object>)(List<?>) list;

        // Choice 3: copy the elements into a new list
        List<Object> objectList3 = new ArrayList<Object>(list);

But choices 1 and 2 seem dangerous: they don’t copy the list, they only cast the reference, so the compiler can no longer guarantee the element type, and a ClassCastException can be thrown later when an item is fetched through a reference whose element type does not match the actual list.
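
To see why, here is a minimal sketch I wrote for this post (the class name is my own): after an unchecked cast like choice 2, a non-String can be added through the List<Object> view, and fetching it back through the original List<String> reference throws a ClassCastException at runtime.

import java.util.ArrayList;
import java.util.List;

public class UncheckedCastDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        strings.add("1");

        // Like choice 2: an unchecked cast, so both variables point to the same list.
        @SuppressWarnings("unchecked")
        List<Object> objects = (List<Object>) (List<?>) strings;

        // Compiles and runs, but it puts a non-String into the List<String>.
        objects.add(Integer.valueOf(42));

        // Throws ClassCastException: Integer cannot be cast to String.
        String s = strings.get(1);
        System.out.println(s);
    }
}

Choice 3 avoids this because the copy really is a List<Object>, independent of the original list.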

HttpClient in Java

Both Apache HttpClient and OkHttpClient are thread-safe.

Apache HttpClient

HttpClient - HttpClient Performance Optimization Guide

Chapter 2. Connection management

OkHttpClient

Concurrency · square/okhttp Wiki · GitHub

OkHttp3 does not use a global singleton thread pool. Each OkHttpClient object can have its own private connection pool.
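
For illustration, a minimal sketch of sharing one OkHttpClient (and therefore one connection pool) across requests; the class name and URL are placeholders of my own, not taken from the OkHttp docs.

import java.io.IOException;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

public class OkHttpSharedClient {
    // One OkHttpClient for the whole application: it owns its own connection pool.
    private static final OkHttpClient CLIENT = new OkHttpClient();

    static String fetch(String url) throws IOException {
        Request request = new Request.Builder().url(url).build();
        // Response is Closeable; closing it returns the connection to the pool.
        try (Response response = CLIENT.newCall(request).execute()) {
            return response.body().string();
        }
    }

    public static void main(String[] args) throws IOException {
        // The same client (and its pool) is reused across calls and across threads.
        System.out.println(fetch("https://example.com/"));
    }
}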

okhttp/CHANGELOG.md at master · square/okhttp · GitHub

Markov Chain Monte Carlo

One of the goals of my current study of statistics is to understand Markov Chain Monte Carlo (MCMC).

I would like to write about one of the popular MCMC algorithms, the Metropolis–Hastings algorithm. It’s a kind of rejection sampling algorithm.

Let  P(x) be the probability distribution of a target distribution  P that you would like to draw samples from.

Let  g(x_{n+1} | x_n) be the probability distribution of a proposal distribution  G that you already know how to draw samples from, conditioned on the previous sample.

  1. Draw a candidate sample  x_{n+1} from the proposal distribution  G (the previous sample is  x_n)

  2. Accept the sample with a probability  \min\left(1,\frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)}\right). Otherwise, reject it.

    It means that you always accept the sample when  \frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)} \geq 1. And you accept the sample with the probability  \frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)} when it’s not.

  3. Repeat 1 and 2 until the distribution of the samples is close enough to the target (a minimal code sketch of these steps follows below).
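
Here is a minimal sketch of these steps in Java, under assumptions of my own for illustration: a standard normal target density and a symmetric Gaussian proposal, so the proposal ratio  \frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)} cancels to 1.

public class MetropolisHastingsSketch {

    // Unnormalized target density P(x); here a standard normal as an example.
    static double targetDensity(double x) {
        return Math.exp(-0.5 * x * x);
    }

    public static void main(String[] args) {
        java.util.Random rng = new java.util.Random(42);
        int numSamples = 10000;
        double[] samples = new double[numSamples];
        double current = 0.0; // initial state x_0

        for (int n = 0; n < numSamples; n++) {
            // Step 1: draw a candidate x_{n+1} from the proposal G,
            // here a Gaussian centered at the current state x_n.
            double candidate = current + rng.nextGaussian();

            // Step 2: accept with probability min(1, P(x_{n+1}) / P(x_n)).
            // The proposal ratio g(x_n | x_{n+1}) / g(x_{n+1} | x_n) is 1
            // because the Gaussian proposal is symmetric.
            double acceptance = Math.min(1.0, targetDensity(candidate) / targetDensity(current));
            if (rng.nextDouble() < acceptance) {
                current = candidate; // accept; otherwise the chain stays at x_n
            }
            samples[n] = current;
        }

        // Step 3: after enough iterations the samples approximate the target.
        double mean = 0.0;
        for (double s : samples) {
            mean += s;
        }
        System.out.println("sample mean: " + (mean / numSamples));
    }
}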

Let’s think about why this algorithm works.

The reversible Markov chain that I explained in the previous post plays an important role in this algorithm.

Let’s assume that samples from the target probability distribution are generated by a reversible Markov chain which has a stationary distribution  \pi(x) = P(x).

Since the Markov chain is reversible, each transition ( x_n \to x_{n+1}) of the Markov chain fulfils the following condition:

 P(x_n | x_{n+1})P(x_{n+1}) = P(x_{n+1} | x_n)P(x_n)

When  \frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)} \geq 1 is true, we always accept a sample.

 \frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)} \geq 1 \Leftrightarrow P(x_{n+1})g(x_n | x_{n+1}) \geq P(x_n)g(x_{n+1} | x_n)

If the proposal transition itself satisfied detailed balance with respect to  P, we would have  P(x_{n+1})g(x_n | x_{n+1}) = P(x_n)g(x_{n+1} | x_n), but in general it does not. The inequality means that the proposal distribution  g(x_{n+1} | x_n) causes the chain to move from  x_n to  x_{n+1} too rarely and from  x_{n+1} to  x_n too often. So whenever we draw such a candidate  x_{n+1} from  G, we accept it.

On the other hand, when  \frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)} < 1, we accept the sample with the probability  \frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)}.

 \frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)} < 1 \Leftrightarrow P(x_{n+1})g(x_n | x_{n+1}) < P(x_n)g(x_{n+1} | x_n)

It means the proposal distribution  g(x_{n+1} | x_n) causes the chain to move from  x_n to  x_{n+1} too often and from  x_{n+1} to  x_n too rarely.

We need to adjust the transition by using an acceptance distribution  A(x_{n+1} | x_n) such that:

 P(x_{n+1})g(x_n | x_{n+1}) =A(x_{n+1} | x_n)  P(x_n)g(x_{n+1} | x_n)

 A(x_{n+1} | x_n) = \frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)}

This is my intuitive understanding of Metropolis–Hastings algorithm.

A more formal explanation, following Wikipedia, is as follows:

 P(x_n | x_{n+1})P(x_{n+1}) = P(x_{n+1} | x_n)P(x_n) can be re-written as

 \frac{P(x_{n+1} | x_n)}{P(x_n | x_{n+1})} = \frac{P(x_{n+1})}{P(x_n)}

We draw a sample from the proposal distribution G and accept it with a certain probability  A(x_{n+1} | x_n) to simulate the target distribution P.

 P(x_{n+1} | x_n) = A(x_{n+1} | x_n) g(x_{n+1} | x_n)

Let’s insert this into the previous equation.

 \frac{A(x_{n+1} | x_n) g(x_{n+1} | x_n)}{A(x_n | x_{n+1}) g(x_n | x_{n+1})} = \frac{P(x_{n+1})}{P(x_n)}

  \Leftrightarrow  \frac{A(x_{n+1} | x_n) }{A(x_n | x_{n+1}) } = \frac{P(x_{n+1})g(x_n | x_{n+1})}{P(x_n)g(x_{n+1} | x_n)}

Then we need to choose an acceptance distribution  A(x_{n+1} | x_n) which fulfils this condition. The choice used in the Metropolis–Hastings algorithm is as follows:

 A(x_{n+1} | x_n) =  \min\left(1,\frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)}\right)

This fulfils the previous condition because

When   \frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)} \geq 1, then  A(x_{n+1} | x_n) = 1.

When   \frac{P(x_{n+1})}{P(x_n)}\frac{g(x_n | x_{n+1})}{g(x_{n+1} | x_n)} < 1  \Leftrightarrow \frac{P(x_n)}{P(x_{n+1})}\frac{g(x_{n+1} | x_n)}{g(x_n | x_{n+1})} > 1, then  A(x_n | x_{n+1}) = 1.
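
To spell out the check: in the first case  A(x_{n+1} | x_n) = 1 and  A(x_n | x_{n+1}) = \frac{P(x_n)}{P(x_{n+1})}\frac{g(x_{n+1} | x_n)}{g(x_n | x_{n+1})}; in the second case  A(x_{n+1} | x_n) is the ratio itself and  A(x_n | x_{n+1}) = 1. In both cases

 \frac{A(x_{n+1} | x_n)}{A(x_n | x_{n+1})} = \frac{P(x_{n+1})g(x_n | x_{n+1})}{P(x_n)g(x_{n+1} | x_n)}

which is exactly the condition derived above.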

Multiple-Column Indexes on MySQL

MySQL can use multiple-column indexes for queries that test all the columns in the index, or just the first column, the first two columns, the first three columns, and so on.

MySQL :: MySQL 5.7 Reference Manual :: 9.3.5 Multiple-Column Indexes

The following index

INDEX(columnA, columnB, columnC)

is equivalent to having the following indexes:

INDEX(columnA, columnB, columnC)
INDEX(columnA, columnB)
INDEX(columnA)

See the following execution plans (note the key, key_len and rows columns):

mysql> CREATE TABLE test (
    ->     id         INT NOT NULL,
    ->     last_name  CHAR(30) NOT NULL,
    ->     first_name CHAR(30) NOT NULL,
    ->     PRIMARY KEY (id),
    ->     INDEX name (last_name,first_name)
    -> );
Query OK, 0 rows affected (0.04 sec)

mysql> INSERT test (id, last_name, first_name) VALUES (1, 'last', 'first'), (2, 'last2','first2');
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from test where last_name = 'last' and first_name = 'first';
+----+-----------+------------+
| id | last_name | first_name |
+----+-----------+------------+
|  1 | last      | first      |
+----+-----------+------------+
1 row in set (0.00 sec)

mysql> explain select * from test where last_name = 'last' and first_name = 'first';
+----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref         | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+-------------+
|  1 | SIMPLE      | test  | NULL       | ref  | name          | name | 180     | const,const |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from test where last_name = 'last' ;
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | test  | NULL       | ref  | name          | name | 90      | const |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from test where first_name = 'first';
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | test  | NULL       | index | NULL          | name | 180     | NULL |    2 |    50.00 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

mysql>

Reversible Markov Chain

In the previous post, I introduced some important Markov chain properties.

I would like to write about a Markov chain property that is used in Markov Chain Monte Carlo (MCMC): reversibility.

Reversibility

Let  X_n be a random variable of a Markov chain. If there is a probability distribution  \pi such that:

 \displaystyle \pi_i P(X_{n+1} = j \mid X_{n} = i) = \pi_j P(X_{n+1} = i \mid X_{n} = j)

Then the Markov chain is said to be reversible.

One trivial example of a reversible Markov chain is as follows:

f:id:nakaly:20170221030823p:plain:w300

A reversible Markov chain has a stationary distribution

When defining  p_{ij} as follows:

 \displaystyle p_{ij} = P(X_{n+1} = j \mid X_{n} = i)

The equation of reversibility can be written as follows:

 \displaystyle \pi_i p_{ij} = \pi_j p_{ji} for all states i,j of a Markov chain

Since the equation is true for all  i and  j, we can sum up the left- and right-hand sides of the equation over  i.

We get the following:

 \displaystyle \sum_i \pi_i p_{ij} = \sum_i \pi_j p_{ji} = \pi_j \sum_i p_{ji} = \pi_j

It means that we get the following equation.

 \displaystyle \pi_j = \sum_i \pi_i p_{ij}

This is the definition of a stationary distribution  \pi.
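
As a concrete example (with numbers of my own choosing, not taken from the figure above), consider a two-state chain with  p_{12} = 0.2,  p_{21} = 0.4 (so  p_{11} = 0.8,  p_{22} = 0.6) and  \pi = (\frac{2}{3}, \frac{1}{3}). Detailed balance holds:

 \displaystyle \pi_1 p_{12} = \frac{2}{3} \cdot 0.2 = \frac{2}{15} = \frac{1}{3} \cdot 0.4 = \pi_2 p_{21}

and  \pi is indeed stationary:

 \displaystyle \sum_i \pi_i p_{i1} = \frac{2}{3} \cdot 0.8 + \frac{1}{3} \cdot 0.4 = \frac{2}{3} = \pi_1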