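Baidu Map API Interface

Project URL: https://github.com/sskaje/baidu_map_api

This was originally built to batch-query the driving distance and time for many pairs of points (start coordinate + end coordinate), then sort the results to recommend routes.

The approach sets up a proxy on BAE (Baidu App Engine): the client submits multiple pairs of coordinates to the proxy interface, and the proxy issues the HTTP requests in batch, combines the results, and returns them.
The files under api are the interface deployed to BAE;
those under client are the client for that BAE interface.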
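The batching idea can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: `batch_query`, `fetch_route`, and `fake_fetch` are hypothetical names, and the real HTTP call to Baidu Maps is replaced by a stand-in so the sketch runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def batch_query(pairs, fetch_route, max_workers=8):
    """Query every (origin, destination) pair concurrently, sorted by duration.

    fetch_route is a hypothetical callable taking (origin, destination) and
    returning a dict with 'distance' and 'duration' keys.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with pairs.
        results = list(pool.map(lambda pair: fetch_route(*pair), pairs))
    combined = [
        {"origin": origin, "destination": destination, **result}
        for (origin, destination), result in zip(pairs, results)
    ]
    # Shortest driving time first, as a basis for route recommendation.
    return sorted(combined, key=lambda item: item["duration"])


def fake_fetch(origin, destination):
    # Stand-in for the real HTTP request; numbers are derived from the
    # coordinates purely for illustration.
    dist = abs(origin[0] - destination[0]) + abs(origin[1] - destination[1])
    return {"distance": dist * 1000, "duration": dist * 60}


if __name__ == "__main__":
    demo_pairs = [((0, 0), (3, 4)), ((0, 0), (1, 1))]
    for row in batch_query(demo_pairs, fake_fetch):
        print(row)
```

Passing the fetch function in as a parameter keeps the batching and sorting logic independent of whichever HTTP client or endpoint the proxy really uses.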

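The assumed environment: the data center where the service runs has a slower connection to Baidu Maps' front-end servers than BAE does.

I currently have about 28 BAE instances with this deployed. However, route queries are implemented only for driving; public transit and walking are not handled. You can restructure the code yourself to handle the qt requests.
If you need it, you can email me for the host list.
Or register a few extra accounts and deploy it yourself; it only takes a bit of fiddling.

Baidu Map API Interface by @sskaje: https://sskaje.me/2013/12/baidu%e5%9c%b0%e5%9b%beapi%e6%8e%a5%e5%8f%a3/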

Project: Merge small files on HDFS for Hive table


Introduction

Github: https://github.com/sskaje/hive_merge

This is a solution for the small-files problem on HDFS, but for Hive tables only.

Here is why I wrote this project: Solving Small Files Problem on CDH4.

This script simply INSERTs the requested table/partition into a new table, lets Hive merge the data itself, then INSERTs it back with compression.
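As a rough illustration of that copy-out and copy-back flow, here is a small Python sketch that only builds the HiveQL statements. It is an assumption about how such a flow could look, not the project's actual code: `build_merge_statements` and the table names are hypothetical, and it covers only an unpartitioned table.

```python
def build_merge_statements(table, tmp_table):
    """Return HiveQL statements for a copy-out / copy-back merge.

    Sketch only: handles an unpartitioned table, and both table names
    are supplied by the caller.
    """
    return [
        # Ask Hive to merge small output files at the end of each job.
        "SET hive.merge.mapfiles=true",
        "SET hive.merge.mapredfiles=true",
        f"CREATE TABLE {tmp_table} LIKE {table}",
        f"INSERT OVERWRITE TABLE {tmp_table} SELECT * FROM {table}",
        # Write the merged data back with output compression enabled.
        "SET hive.exec.compress.output=true",
        f"INSERT OVERWRITE TABLE {table} SELECT * FROM {tmp_table}",
        f"DROP TABLE {tmp_table}",
    ]


if __name__ == "__main__":
    # 'logs' and 'logs_merge_tmp' are made-up names for the demo.
    for statement in build_merge_statements("logs", "logs_merge_tmp"):
        print(statement + ";")
```

Generating the statements as plain strings makes it easy to review them before handing them to the hive command line.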

Continue reading “Project: Merge small files on HDFS for Hive table” »

Project: Merge small files on HDFS for Hive table by @sskaje: https://sskaje.me/2013/12/project-merge-small-files-hdfs-hive-table/


Solving Small Files Problem on CDH4

This morning when I opened Cloudera Manager, it showed the NameNode server as ‘Concerning’ with a message like ‘The DataNode has xxx blocks. Warning threshold: 200,000 block(s).’.
I googled this and found that there might be too many files on HDFS. The default block size is 128MB on my CDH4, and every file, even one holding a single byte, occupies at least one block. A small file does not consume 128MB of disk, but each block still counts toward the block total and costs NameNode memory.

Then I ran hdfs dfs -count to find the number of files in each directory on HDFS: about 70k files under /user/hdfs/.staging and 170k under a folder for Flume-NG.
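To rank directories by file count, the output of hdfs dfs -count can be parsed with a few lines of Python. This is a sketch under stated assumptions: it relies on the default column order (directory count, file count, content size, path), `parse_count_output` is a hypothetical helper, and the sample numbers and the /flume/logs path are invented for the demo.

```python
def parse_count_output(text):
    """Parse `hdfs dfs -count` output and sort directories by file count.

    Assumes the default columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        # Skip blank or malformed lines instead of failing on them.
        if len(parts) != 4 or not all(p.isdigit() for p in parts[:3]):
            continue
        dirs, files, size, path = parts
        rows.append({
            "path": path.strip(),
            "dirs": int(dirs),
            "files": int(files),
            "bytes": int(size),
        })
    # Directories holding the most files come first.
    return sorted(rows, key=lambda row: row["files"], reverse=True)


if __name__ == "__main__":
    # Invented sample output, for illustration only.
    sample = (
        "           1        70000          123456 /user/hdfs/.staging\n"
        "           5       170000        98765432 /flume/logs\n"
    )
    for row in parse_count_output(sample):
        print(row["files"], row["path"])
```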

I’m collecting logs from syslog with Flume-NG on CDH4, sinking them to HDFS and MySQL (Infobright), and trying to analyse them with Hive. The HDFS part of the configuration looks like:

Continue reading “Solving Small Files Problem on CDH4” »

Solving Small Files Problem on CDH4 by @sskaje: https://sskaje.me/2013/12/solving-small-files-problem-cdh4/

Collections: Integer factorization

Integer factorization: http://en.wikipedia.org/wiki/Integer_factorization

In number theory, integer factorization or prime factorization is the decomposition of a composite number into smaller non-trivial divisors, which when multiplied together equal the original integer.
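For small numbers the definition can be demonstrated with plain trial division. The Python sketch below is only an illustration of the concept (`factorize` is a hypothetical helper) and is far simpler than the sieve algorithms needed for large integers.

```python
def factorize(n):
    """Return the prime factors of n in ascending order, by trial division."""
    if n < 2:
        raise ValueError("n must be an integer >= 2")
    factors = []
    divisor = 2
    # A composite n must have a factor no larger than its square root.
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    # Whatever remains above 1 is itself prime.
    if n > 1:
        factors.append(n)
    return factors


if __name__ == "__main__":
    print(factorize(360))
    print(factorize(8051))
```

Multiplying the returned factors together gives back the original integer, which is an easy way to check a result.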

msieve: http://sourceforge.net/projects/msieve/

Msieve is a C library implementing a suite of algorithms to factor large integers. It contains an implementation of the SIQS and GNFS algorithms; the latter has helped complete some of the largest public factorizations known.

msieve supports CUDA!

Continue reading “Collections: Integer factorization” »

Collections: Integer factorization by @sskaje: https://sskaje.me/2013/12/collections-integer-factorization/


Virtualized ARM on Ubuntu

I was looking for articles/wikis on how to emulate ARM Linux (armel) on CentOS/Ubuntu, and found this on MDN: https://developer.mozilla.org/en-US/docs/Developer_Guide/Virtual_ARM_Linux_environment

That article uses an old Linaro release based on Ubuntu Natty, which can no longer be found on http://ports.ubuntu.com.
According to Ubuntu, armel is no longer supported, which is why the latest Ubuntu release supporting armel has a code name beginning with ‘Q’.

I found another server release and a newer nano release and tried them with similar commands; my notes are below:

Continue reading “Virtualized ARM on Ubuntu” »

Virtualized ARM on Ubuntu by @sskaje: https://sskaje.me/2013/12/virtualized-arm-ubuntu/
