Algorithm Notes – Page 2

4 March 2017

x264 c++ encode example

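With libx264 built, the obvious next step is to try encoding a video. The encoder output is only a raw H.264 bitstream, not a file that can be played directly; a wrapper (muxer) is still needed to package it into a common video format such as mp4 or mkv.

The lazy way is to let ffmpeg do the packaging XD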

ffmpeg -i in_file.h264 -vcodec copy out_file.mp4

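These libraries basically form a chain of layers, each one built on top of the one below it. The higher the layer, the heavier and more complex it gets. Here I do a small experiment at the first step, the codec layer.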

codec library: e.g. x264
- only encoder & decoder for data stream (h264)
video file container: e.g. libav
- pack/unpack video file (mp4, avi, mkv)
multi container & codec: e.g. ffmpeg
- All file formats & all codecs (avi:h264, avi:h265, mkv:h264…)

Reference: an h264/h265 bitstream analysis tool, which helps in understanding the file format

The code is attached at the end of the post.
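To get a feel for what a bitstream analysis tool looks at, here is a minimal C++ sketch that walks a raw H.264 byte stream in Annex B format (the start-code-delimited layout x264 writes by default) and lists the NAL units it finds. It is not the code from this post: `NalUnit`, `findNalUnits` and `nalTypeName` are hypothetical names chosen for illustration, and only the most common NAL types are named.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One NAL unit located in an Annex B byte stream.
struct NalUnit {
    std::size_t offset;  // index of the NAL header byte, right after the start code
    int type;            // nal_unit_type: the low 5 bits of the header byte
};

// Scan for 00 00 01 start codes. A 4-byte start code (00 00 00 01) ends with
// the same three bytes, so it is found by the same test.
std::vector<NalUnit> findNalUnits(const std::vector<std::uint8_t>& data) {
    std::vector<NalUnit> units;
    for (std::size_t i = 0; i + 3 < data.size(); ++i) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            units.push_back({i + 3, data[i + 3] & 0x1F});
            i += 2;  // skip the rest of the start code
        }
    }
    return units;
}

// Human-readable names for a few common H.264 NAL unit types.
const char* nalTypeName(int type) {
    switch (type) {
        case 1: return "non-IDR slice";
        case 5: return "IDR slice";
        case 6: return "SEI";
        case 7: return "SPS";
        case 8: return "PPS";
        default: return "other";
    }
}
```

Reading an encoded file into a byte vector and printing `nalTypeName(u.type)` for each unit should typically show an SPS and a PPS ahead of the first IDR slice, which is the information a container needs when wrapping the stream.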


26 February 2017 (updated 4 March 2017)

Compile x264 library with mingw-w64

This follows the reference entirely. Note that YASM may have to be downloaded separately and dropped into mingw/bin/

Download the x264 source code
run ./configure
run make

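But if it were that simple, there would be no need to write a post.

MSYS simply could not find Mingw-w64, no matter what I tried; it kept reporting that gcc was not found.

That lasted until I downloaded MSYS2. Yes, just one extra "2". After putting mingw32 and mingw64 under C:\msys64 and launching the matching MSYS environment (e.g. mingw64.exe), the build worked!

Attached: a prebuilt package for mingw-w64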

22 February 2017 (updated 24 February 2017)

Javascript file path format extraction

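Given a file path, convert it into a sprintf format.

For example, turn

'D:\\\\workspace\\path-parser\\img\\img_0005.txt'

into

D:\\\\workspace\\path-parser\\img\\img_%04d.txt

Basically it is just a dumb directory-scanning approach, but it works… The algorithm is as follows, and the JavaScript code is attached at the end of the post.

Convert the input path into a regex (assuming the index is the last group of digits in the path)
Use the regex to find matching file names in that directory, recording the longest and shortest file name lengths
If the two lengths are equal, the index is zero-padded (e.g. %04d); otherwise it is not
Write the format according to whether there is zero padding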
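The four steps can be sketched in C++ as follows. This is not the JavaScript attached to the post but a rough equivalent under some assumptions: the directory listing is passed in as a vector of file names instead of being scanned from disk, only the file name (not the full path) is handled, and `escapeRegex` and `extractFormat` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Escape regex metacharacters so the literal parts of the name match verbatim.
static std::string escapeRegex(const std::string& s) {
    const std::string special = R"(.^$|()[]{}*+?\)";
    std::string out;
    for (char c : s) {
        if (special.find(c) != std::string::npos) out += '\\';
        out += c;
    }
    return out;
}

// Derive a printf-style format from one file name and its sibling names.
std::string extractFormat(const std::string& fileName,
                          const std::vector<std::string>& dirEntries) {
    // Step 1: split the name around its last run of digits and build a regex.
    static const std::regex lastDigits(R"(^(.*?)(\d+)(\D*)$)");
    std::smatch m;
    if (!std::regex_match(fileName, m, lastDigits)) return fileName;  // no index
    const std::string prefix = m[1].str();
    const std::string digits = m[2].str();
    const std::string suffix = m[3].str();
    const std::regex pattern(escapeRegex(prefix) + R"(\d+)" + escapeRegex(suffix));

    // Step 2: record the shortest and longest matching file name.
    std::size_t minLen = fileName.size();
    std::size_t maxLen = fileName.size();
    for (const std::string& entry : dirEntries) {
        if (std::regex_match(entry, pattern)) {
            minLen = std::min(minLen, entry.size());
            maxLen = std::max(maxLen, entry.size());
        }
    }

    // Steps 3 and 4: equal lengths mean the index is zero-padded.
    const std::string spec = (minLen == maxLen)
        ? "%0" + std::to_string(digits.size()) + "d"
        : std::string("%d");
    return prefix + spec + suffix;
}
```

With siblings `img_0001.txt` and `img_0123.txt`, the name `img_0005.txt` maps to `img_%04d.txt`; with siblings `img_10.txt` and `img_123.txt`, the name `img_5.txt` maps to `img_%d.txt`. A lone file always counts as zero-padded, which is a limitation of the length-comparison heuristic itself.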


16 January 2017

zlib node.js example

const zlib = require('zlib');
const fs   = require('fs');

// generate data
let buf = Buffer.alloc(4 * 240 * 135);
for (let y = 0, i = 0; y < 135; ++y) {
	for (let x = 0; x < 240; ++x, ++i) {
		buf.writeFloatBE(x*y, i*4);
	}
}

// compress
let bufCompress = zlib.deflateSync( buf );

console.log( bufCompress );

// write file
let file = fs.openSync('data.z', 'w+');
fs.writeSync( file, 'test\nabc\n' );                        // write some header
fs.writeSync( file, bufCompress, 0, bufCompress.length );   // write compressed data
fs.closeSync( file );

// read file
let readBuf = fs.readFileSync('data.z');
let readArr = readBuf.toString().split('\n');
console.log( readArr[0] );
console.log( readArr[1] );
let offset = readArr[0].length + readArr[1].length + 2;
bufCompress = readBuf.slice(offset);

console.log( bufCompress );

// decompress
let bufUncompress = zlib.inflateSync( bufCompress );

// check data
for (let i = 0; i < 240*135*4; ++i) {
	if ( buf[i] != bufUncompress[i] ) {
		console.log( 'error' );
	}
}

console.log( buf.length );
console.log( bufCompress.length );
console.log( bufUncompress.length );

31 December 2016

zlib example

#include <stdio.h>

#include <zlib.h>
#include <cstdlib>
#include <vector>

using namespace std;

void compressFile(const char *fileName, const vector<char> &in) {
	//we will use GZip from zlib
	gzFile gz_file;
	//open the file for writing in binary mode
	gz_file = gzopen(fileName, "wb");

	//Get the size of the stream
	unsigned long int file_size = sizeof(char) * in.size();
	//Write the size of the stream, this is needed so that we know
	//how much to read back in later
	gzwrite(gz_file, (void*) &file_size, sizeof(file_size));
	//Write the data
	gzwrite(gz_file, (void*) in.data(), file_size);
	//close the file
	gzclose(gz_file);
}

void decompressFile(const char *fileName, vector<char> &out) {
	//open the file for reading in binary mode
	gzFile gz_file = gzopen(fileName, "rb");
	//this variable will hold the size of the file
	unsigned long int size;
	//we wrote out an unsigned long int when storing the file
	//read this back in to get the size of the uncompressed data
	gzread(gz_file, (void*) &size, sizeof(size));
	//resize the vector
	out.resize(size / sizeof(char));
	//read in and uncompress the entire data set at once
	gzread(gz_file, (void*) out.data(), size);
	//close the file
	gzclose(gz_file);
}

int main(int argc, char* argv[])
{
	vector<char> in, out;

	// generate data
	for (int i = 0; i < 1920*1080; ++i) {
		int d = rand();
		for (int j = 0; j < 8; ++j) {
			char d2 = (d >> j) & 0xF;
			in.push_back(d2);
		}
	}

	compressFile("com.gz", in);
	decompressFile("com.gz", out);

	// check decompressed data
	for (int i = 0; i < (int) out.size(); ++i) {
		if ( in[i] != out[i] ) {
			printf("error\n");
		}
	}
    return 0;
}

16 November 2016

C programming language: Volatile is dangerous

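volatile really bit me recently.

I had always thought of volatile as simply "re-read the value from memory every time this statement executes, instead of reusing a copy the compiler has kept in a register", which matters because memory may be updated by an interrupt at run time and compiler optimization can otherwise produce surprising behavior. (Strictly speaking, volatile constrains the compiler, not the CPU's hardware cache.) It turns out there is more to it than that.

A classic while loop example: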

volatile int loop = 1;
while ( loop ) {
    printf("I'm running.\n");
}

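Without the volatile qualifier, the optimized code may never leave the while loop, because the compiler is free to read loop once and assume it never changes. But I'm afraid the story does not end at forcing reads to go to memory.

Consider the following simple loop: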

for (int i = 0; i < 256; ++i) {
    *((volatile unsigned long *)(0x12345678)) = i;
}

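We repeatedly write different values to address 0x12345678. Firmware often does this to feed a large array into hardware that only exposes a register interface at a single fixed address.

Without the volatile qualifier, the compiler may optimize the code above into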

*((unsigned long *)(0x12345678)) = 255;

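This means only the last value is written to the table, and the full array is never filled in correctly.

That does not mean we should turn off compiler optimization, or frantically add volatile to every memory access. What we need is to understand the implicit meaning of the code.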
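One way to make that implicit meaning explicit is to qualify only the register access as volatile and leave everything else to the optimizer. The sketch below assumes a hypothetical `fakeRegister` variable standing in for the hardware register so that it can run on a PC; `fillTable` is likewise an illustrative name, not code from a real driver.

```cpp
#include <cstdint>

// Hypothetical stand-in for a memory-mapped data register. On real hardware
// this would be a fixed address from the datasheet (0x12345678 in the post).
static volatile std::uint32_t fakeRegister = 0;

// Stream a table into a single register, one element per store.
// Only the register pointer is volatile-qualified: the compiler must emit
// every store, in order, but remains free to optimize the reads of table.
void fillTable(volatile std::uint32_t* reg, const std::uint32_t* table, int count) {
    for (int i = 0; i < count; ++i) {
        *reg = table[i];
    }
}
```

Calling `fillTable(&fakeRegister, table, n)` performs n separate stores. Dropping volatile from the parameter type would again allow the compiler to keep only the last one.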

8 November 2016 (updated 15 November 2016)

OpenMP nested parallelization

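By default OpenMP typically parallelizes only the first level of directives. When a thread inside a parallel region reaches another OpenMP directive, the inner region is not parallelized and runs on that thread alone. We can, however, force nested directives to expand into parallel regions, and this also works when the inner threads are created inside a called function.

There are two ways to do it: call omp_set_nested(1) in the code, or set an environment variable. Either works.

The first way is to call omp_set_nested in the code, which requires including omp.h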

omp_set_nested(1);

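The second way is to set an environment variable. Note that the environment variable has lower priority than omp_set_nested, so a call in the code overrides it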

set OMP_NESTED=TRUE

Here is a complete nested OpenMP parallel for example using omp_set_nested. The first call to fun() runs 4 threads at once; the second call, with nested parallelization turned off, runs only 2 threads.

#include <omp.h>
#include <stdio.h>
#include <time.h>
#include <windows.h>
 
void fun() {
	#pragma omp parallel for
	for (int i = 0; i < 2; ++i) {
		#pragma omp parallel for
		for (int j = 0; j < 2; ++j) {
			printf("%d %d %d\n", i, j, (int)clock());
			Sleep(1000);
		}
	}
}

int main() {

	omp_set_nested(1);
	fun();
	
	omp_set_nested(0);
	fun();
	
	return 0;
}

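In the output, when omp_set_nested is set to 1 the second-level loop is parallelized as well; otherwise the inner loop runs sequentially without parallelization.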
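A note for newer toolchains: omp_set_nested has been deprecated since OpenMP 5.0 in favor of omp_set_max_active_levels. The sketch below shows the replacement call under the assumption that the compiler supports OpenMP 3.0 or later; `runNested` is a hypothetical helper, and the `_OPENMP` guards only exist so the file still compiles when OpenMP is switched off.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif
#include <atomic>

// Run a 2x2 nested loop with at most `levels` active parallel levels and
// return how many inner iterations executed.
int runNested(int levels) {
#ifdef _OPENMP
    // levels = 2 allows both loops to run in parallel,
    // levels = 1 keeps only the outer loop parallel.
    omp_set_max_active_levels(levels);
#endif
    std::atomic<int> count{0};
    #pragma omp parallel for
    for (int i = 0; i < 2; ++i) {
        #pragma omp parallel for
        for (int j = 0; j < 2; ++j) {
            ++count;  // atomic increment, safe across threads
        }
    }
    return count.load();
}
```

Both `runNested(2)` and `runNested(1)` execute all four iterations; only the degree of parallelism differs, which can be observed by adding a sleep and timestamps as in the example above.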

15 October 2016 (updated 17 May 2022)

The power of camera matrix : Volumetric Reconstruction

visualltex — Visual Hull Reconstruction with texture mapping


14 October 2016 (updated 15 October 2016)

Traverse Clang AST: visitor

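In the previous post, Clang AST dump, we covered the command that makes clang dump the complete AST. With that output you could already write your own program to traverse the AST. But why parse it a second time? We can use libclang to work directly on the AST it has already built.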


9 October 2016

Build LLVM+Clang from source code with mingw

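I have to write this down; I'm getting too old to remember this much detail. I searched for information and went through trial and error in the spirit of looking up a game walkthrough!! This is really hard, and the build alone is enough to make plenty of people give up…

I just want to use clang on win7 + mingw! The official site does not provide a build for that, so I had to compile it myself.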

Reference

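Requirements: make sure all of these are on the system path

mingw32 or mingw-w64
cmake
python 2 or python 3, either works
My OS: Windows 7 64-bit. Some people instead cross-compile for the target on Linux, which is easier to set up, and nearly all the steps found online are Linux commands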

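Downloads: a pile of source packages that must be placed in a specific directory structure for the build to work

LLVM source code: rename the folder to llvm
Clang source code: rename the folder to clang and put it in llvm/tools/
clang-tools-extra source code: (optional) rename the folder to extra and put it in llvm/tools/clang/tools/; this is what provides the additional clang tools, e.g. AST matcher
compiler-rt source code: (optional) rename the folder to compiler-rt and put it in llvm/projects/
Almost all other source packages go under llvm/projects/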

Steps

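Run cmake-gui and set the source and build paths
Run configure and choose the mingw build
Find CMAKE_BUILD_TYPE and set it to Release or MinSizeRel to get a smaller LLVM build; otherwise the libraries are enormous
Run generate and make sure there are no error messages
Open cmd in the build directory and run mingw32-make to compile
In the same directory, run mingw32-make install to populate the install directory (the official documentation suggests running mingw32-make check-all first, but it always failed on win7, so I skipped it)
There are too many libraries under lib to list one by one, so you can archive them to make linking easier later. This example uses a thin archive, which only records the file names instead of actually packing all the *.a files. In the llvm/lib/ directory, run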
```
ar -rcT libclang.a *.a
```
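From then on, linking against clang only requires linking libclang.a. If you use the VC++ binaries provided on the official site instead, you can link libclang.lib as if it were an object file at link time.

A clang.exe built this way basically still uses mingw's standard library and needs the mingw runtime to run, so make sure mingw is on the system path before using clang.exe.

Prebuilt package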