I am working on a project that can be simplified to the following problem:
The system allows developers to create an app; an app contains many files and directories. Each file and directory has some attributes (such as country, language, vip_flag, ...).
Normal users (non-developers) can access the system and download the app. For each user, depending on his/her profile (country, language, operating system, VIP or not), a subset of the app's files and directories is downloaded, and the user can then play the app.
Developers can publish a new version of a file or directory, and users only need to download the updated files (no need to download everything again).
One solution I am considering:
Flatten all directories and keep every file in one single directory, naming each file by the hash of its contents (similar to git). Files are delivered to users via libtorrent (not all files, just the subset matching the user's profile).
The paths of the files are kept in a database as (file_id, hash_value, path) rows; a small ingestion sketch follows the example below.
example:
1 30e83c2518b0e5ee5f4de3702dec test/test.txt
2 1ae8ef251450e5gfrf4rf6u02dec a/hello.abc
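For the ingestion side, here is a minimal sketch of what I have in mind, in Python (the store directory name and the sqlite schema are just assumptions for illustration):

```python
import hashlib
import shutil
import sqlite3
from pathlib import Path

STORE_DIR = Path("store")          # the single flat directory (assumed name)
db = sqlite3.connect("files.db")   # hypothetical metadata database
db.execute("CREATE TABLE IF NOT EXISTS files"
           " (file_id INTEGER PRIMARY KEY, hash TEXT, path TEXT)")

def ingest(src: Path, app_path: str) -> str:
    """Hash a file's contents, copy it into the flat store, record its path."""
    digest = hashlib.sha1(src.read_bytes()).hexdigest()
    STORE_DIR.mkdir(exist_ok=True)
    target = STORE_DIR / digest
    if not target.exists():        # content-addressed: identical content stored once
        shutil.copyfile(src, target)
    db.execute("INSERT INTO files (hash, path) VALUES (?, ?)", (digest, app_path))
    db.commit()
    return digest
```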
A new version of a file is simply treated as a new file (new hash) whose path is the same as the old one.
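With that convention, deciding what an updating user must download reduces to comparing two (path → hash) manifests; a sketch:

```python
def files_to_download(old_manifest: dict[str, str],
                      new_manifest: dict[str, str]) -> set[str]:
    """Return the hashes the user is missing: new paths plus changed contents."""
    return {h for path, h in new_manifest.items()
            if old_manifest.get(path) != h}

old = {"test/test.txt": "30e8..."}
new = {"test/test.txt": "99ab...", "a/hello.abc": "1ae8..."}
print(files_to_download(old, new))   # both hashes: one changed, one new path
```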
Each file_id also has its attributes, kept in the database as well; a sketch of the profile matching follows the example below.
example:
1 VN,Windows
2 EN,iOS,VIP
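Selecting a user's subset then becomes a match between the user's profile and each file's attribute set. A minimal sketch, assuming the attributes are stored as a comma-separated string as in the example above:

```python
def visible_files(rows, user_profile):
    """Keep files whose every required attribute appears in the user's profile.

    rows: iterable of (file_id, attrs) tuples, attrs like "EN,iOS,VIP".
    user_profile: a set of attribute strings.
    """
    for file_id, attrs in rows:
        if set(attrs.split(",")) <= user_profile:   # subset test
            yield file_id

rows = [(1, "VN,Windows"), (2, "EN,iOS,VIP")]
print(list(visible_files(rows, {"VN", "Windows"})))   # [1]: no iOS/VIP files
```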
After the files are downloaded, a build script reads the file paths from the database and generates the app's directory tree so the user can play the app.
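Concretely, the build script could walk the (hash, path) rows and materialize the tree with hard links, so the downloaded data is never copied. A sketch, under the assumption that the table holds only the current version of each path:

```python
import os
import sqlite3
from pathlib import Path

def build_app(db_path: str, store_dir: Path, out_dir: Path) -> None:
    """Recreate the app's directory tree from the flat content store."""
    db = sqlite3.connect(db_path)
    for digest, rel_path in db.execute("SELECT hash, path FROM files"):
        dest = out_dir / rel_path
        dest.parent.mkdir(parents=True, exist_ok=True)   # recreate directories
        if dest.exists():
            dest.unlink()                                # drop the old version
        os.link(store_dir / digest, dest)                # hard link instead of copy

# e.g. build_app("files.db", Path("store"), Path("app")), reusing the ingestion db
```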
While working on this approach, I have been studying git. So far I know that every object lives under .git/objects, which is like an internal warehouse of blobs, all indexed by SHAs. But I want to learn more about how git indexes and organizes files and directories, and how it builds the whole directory tree back up from those objects.
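From what I have pieced together so far: a blob's id is the SHA-1 of a small header ("blob <size>\0") plus the contents, not of the raw file alone, and directories are tree objects whose entries (mode, name, SHA) point at blobs and sub-trees, so the whole hierarchy can be rebuilt recursively from the root tree. A quick sketch of the blob part; its output matches `git hash-object <file>`:

```python
import hashlib

def git_blob_sha(content: bytes) -> str:
    """Compute a blob id the way git does: 'blob <size>' + NUL header + contents."""
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

# The empty blob gives the well-known e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_sha(b""))
```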