Docs

author: Christian Mollekopf <chrigi_1@fastmail.fm> 2015-12-23 12:18:54 +0100
committer: Christian Mollekopf <chrigi_1@fastmail.fm> 2015-12-23 12:18:54 +0100
commit: 4210e5a24f700e5104333ed014c3269262b66bec (patch)
tree: 15bd4881d67b18265729fb2fa25fed0af6cf56af /docs/storage.md
parent: b259728a4f63e022526ef86e6b5d6c62d9938d13 (diff)
1 files changed, 37 insertions, 3 deletions
diff --git a/docs/storage.md b/docs/storage.md
index 26469a7..b6d73fe 100644
--- a/docs/storage.md
+++ b/docs/storage.md
@@ -86,7 +86,7 @@ Storage is split up in multiple named databases that reside in the same database
 ```
 
 The resource can be effectively removed from disk (besides configuration),
-by deleting the `$RESOURCE_IDENTIFIER` directory and everything it contains.
+by deleting the directories matching `$RESOURCE_IDENTIFIER*` and everything they contain.
 
 #### Design Considerations
 * The stores are split by buffertype, so a full scan (which is done by type), doesn't require filtering by type first. The downside is that an additional lookup is required to get from revision to the data.
@@ -95,9 +95,43 @@ by deleting the `$RESOURCE_IDENTIFIER` directory and everything it contains.
 Every operation (create/delete/modify), leads to a new revision. The revision is an ever increasing number for the complete store.
 
 #### Design Considerations
-By having one revision for the complete store instead of one per type, the change replay always works across all types. This is especially important in the write-back
-mechanism that replays the changes to the source.
+By having one revision for the complete store instead of one per type, the change replay always works across all types. This is especially important in the write-back mechanism that replays the changes to the source.
 
+### BLOB properties
+Files are used to handle large opaque properties that should not end up in memory. BLOB properties are by nature never queryable (extract parts of them into other properties if indexes are required).
+
+For reading:
+
+Resources...
+* store the file in ~/akonadi2/storage/$RESOURCE_IDENTIFIER_files/
+* store the filename in the blob property.
+* delete the file when the corresponding entity is deleted.
+
+Queries...
+* copy the file of the requested property to /tmp/akonadi2/client_files/ and provide the path in the property.
+* guarantee that the file exists for the lifetime of the query result.
+
+Clients...
+* load the file from disk and use it as they wish (moving is fine too).
+
+For writing:
+
+Clients...
+* request a path from akonadi2 and store the file there.
+* store the path of the written file in the property.
+
+Resources...
+* move the file to ~/akonadi2/storage/$RESOURCE_IDENTIFIER_files/
+* store the new path in the entity.
+
+#### Design Considerations
+Using regular files as the interface has the following advantages:
+* Existing mechanisms can be used to stream data directly to disk.
+* The necessary file operations can be handled efficiently, depending on OS and filesystem.
+* We avoid reinventing the wheel.
+
+The copy is necessary to guarantee that the file remains available to the client/resource even if the resource removes the file on its side as part of a sync.
+The copy could be optimized by using hardlinks, although that is not a portable solution. On some next-gen copy-on-write filesystems, copying is a very cheap operation.
 
 ### Database choice
 By design we're interested in key-value stores or perhaps document databases. This is because a fixed schema is not useful for this design, which makes
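
Reviewer sketch (not part of the commit): the BLOB read and write paths added in this patch could look roughly like the following. This is a minimal illustration in Python under stated assumptions, not the project's implementation. The function names `store_blob` and `export_blob` are hypothetical; the directories passed in stand in for `~/akonadi2/storage/$RESOURCE_IDENTIFIER_files/` and `/tmp/akonadi2/client_files/` from the text. The hardlink attempt reflects the optimization the patch mentions, with a plain copy as the portable fallback.

```python
import os
import shutil
from pathlib import Path


def store_blob(resource_files: Path, client_path: Path) -> Path:
    """Write path (hypothetical helper): the resource moves a file that a
    client has written into the resource's own storage directory."""
    resource_files.mkdir(parents=True, exist_ok=True)
    target = resource_files / client_path.name
    shutil.move(str(client_path), str(target))
    # The returned path is what would be stored in the entity's blob property.
    return target


def export_blob(stored: Path, client_files: Path) -> Path:
    """Read path (hypothetical helper): hand the client its own directory
    entry so the data survives the resource deleting its file during a sync."""
    client_files.mkdir(parents=True, exist_ok=True)
    target = client_files / stored.name
    try:
        # Cheap when it works, but not portable and it fails across filesystems.
        os.link(stored, target)
    except OSError:
        # Portable fallback: a real copy of the file contents and metadata.
        shutil.copy2(stored, target)
    return target
```

With either branch, removing the resource-side file afterwards leaves the exported file readable. Note that a hardlink shares its data with the original, so it only matches the copy semantics if the resource never modifies stored files in place.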