-
Notifications
You must be signed in to change notification settings - Fork 297
sync: Send a hash of all database rules at beginning and end of sync #1409
Conversation
CC_SHA256_Init(&sha); | ||
|
||
[self inDatabase:^(FMDatabase *db) { | ||
FMResultSet *rs = [db executeQuery:@"SELECT * FROM rules"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might need to exclude transitive rules here as it'll change your hash of hashes and the server will have no idea that these exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also any rules added manually while a sync server didn't exist (though I don't think we track such thing today). However this is likely niche...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thought - there's been requests and talks in the past about per-user rules. It would be good to have a bit of an idea about how something like those might be represented here if they're ever implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pmarkowsky Good call, I've excluded transitive rules
@mlw If a sync server is added the first sync afterward is a clean sync so the hashes shouldn't matter. For per-user rules those would be implemented on the server side anyway, so that shouldn't matter here. The purpose of this is not necessarily to have the server calculate the hash and compare them but to store the hash sent in a postflight and then validate that the one sent in a preflight matches the stored value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mlw I'm assuming you're talking about the rumblings in the past that folks would like to block execution by the user e.g. root can run X but not user peterm.
That'd have to be implemented in the client but is an extension to rules similar to #1200.
Re: Hash of Hashes. Do you need to sort by identifier and policy to create a stable hash? Like if I get a rule out of order for some reason i'll end up with a different hash even though the set of policies is the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to sort by identifier and policy to create a stable hash? Like if I get a rule out of order for some reason i'll end up with a different hash even though the set of policies is the same.
Right now this hashes based on insertion order, which is stable on the client at least. The server could ensure this is stable by always ordering by creation timestamp but I'm not really expecting servers to calculate this in the same way as the client but instead just check the hash doesn't change during preflight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now this hashes based on insertion order, which is stable on the client at least. The server could ensure this is stable by always ordering by creation timestamp but I'm not really expecting servers to calculate this in the same way as the client but instead just check the hash doesn't change during preflight.
That's good to know.
Not saying you should do this as the hash is probably the right tradeoff for now, but previously, I looked into using an xorfilter/bloomfilter rather than a hash to avoid worrying about ordering e.g. this single header one https://github.com/FastFilter/xor_singleheader.
The advantage to this is that you're creating a set membership data structure so order becomes irrelevant.
The process would just be:
- Hash rules (policy, state, identifier) to a 64-bit integer (e.g. use murmur3)
- Insert into the xorfilter
- Serialize and ship the xorfilter to the server in postflight
- Rehydrate the xorfilter on the server and then for all rules per user check if they're in the supplied xorfilter
Alternatively you could also receive a xorfilter from the server of all the expected rules and as you're scanning the Santa rule db check if those rules are supposed to be in the client DB. This would let you know exactly which rules were spurious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see 3 problems with that approach
- Serializing and de-serializing the filter may require reimplementing the serialization - the Go implementation in particular doesn't support this
- This sounds far more expensive on the server-side; what I've proposed requires saving a single extra field in postflight and doing a simple 64-byte string comparison during preflight. The fuse filter method would require looping over all rules in postflight to check that they're in the filter.
- This will not detect extra rules that have been added. Reversing the process would find extra rules (which is better, imo) but then not detect missing rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Late reply, but I was referring to some unimplemented per-user rule feature. (I think this has been requested multiple times, but the only reference I found quickly was: #1280 )
This may not matter much, and we should certainly not overfit to some unimplemented feature. I could see the hash-of-hashes being user-agnostic and that would likely be just fine. I was just hoping to consider it early in case there were any obvious/easy tweaks to make to this draft that would better adapt to some new, more complex rule set (per user, CEL, etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mlw Gotcha. I don't think that should be an issue, whether we implement that as an extra field in a rule or via CEL, we would add the new field(s) to those included in each rules' hash.
while ([rs next]) { | ||
char digest[128]; | ||
uint32_t len = snprintf( | ||
digest, 128, "%s:%d:%d:%u", [rs stringForColumn:@"identifier"].UTF8String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/128/sizeof(digest)/
What do you think about adding the hash of hashes value to |
No description provided.