Hey Sanne!

Comments inlined.

Thanks
Sebastian

On Fri, Nov 25, 2016 at 2:55 PM, Sanne Grinovero <sanne@infinispan.org> wrote:
Hi Sebastian,
you're opening a very complex (but interesting!) topic.

As the paper you linked to also reminds us, it's extremely hard to
implement such a thing without "giving away" lots of useful metadata
to a potential attacker. It's an interesting paper as they propose a
technique to maintain query capabilities while not having the full
data readability, yet like other papers I've seen before it's both
complex to implement and leaves some questions unanswered; in this
case they seem to be "just" unable to camouflage the data access
patterns, which is pretty good but according to some experts really
not enough to keep the decryption keys safe.

The typical problem is that if the server has no clue about the
encrypted blobs at all we won't be able to query it. However there's
ongoing research (like this one?) about being still able to run
queries on behalf of key-owning clients, identify a subset of the
data, e.g. a *naive* example: if you know the data structure and can
tell which section contains the "encrypted surname", then a client
could query for identical matches on the "encrypted surname"; however
this naive approach is critically flawed, in that you might be able to
extract the encryption keys by analysing the statistical frequency of
signatures and running a dictionary attack, e.g. you might have a good
guess about which surname is expected to be the most commonly used.
You'll need salting techniques combined with the query capabilities,
e.g. MACs (message authentication codes), but these either require you
to trust the database (are we going in circles?) or expose you to
other forms of attack.

Yes, you are correct. Not being able to query the server is a very serious problem. But preventing a potential attacker from analyzing your communication seems easy to solve: just use TLS to encrypt the connection between the client and the server.

So I think the main challenge is how to perform a search operation through an encrypted data set...
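
To make the naive "encrypted surname" example above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the secret, the sample surnames and the "public ranking" are all made up, and a deterministic HMAC stands in for whatever deterministic encryption a real client might use. It shows both sides: the equality query works, and an observer who only sees the tokens can still guess the surnames from their frequencies, without ever touching the key.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret held only by the client (assumption for this sketch).
SECRET = b"client-side-secret"


def token(surname: str) -> str:
    """Deterministic keyed token: equal surnames always give equal tokens."""
    return hmac.new(SECRET, surname.encode("utf-8"), hashlib.sha256).hexdigest()


# What the client stores; the server only ever sees the tokens.
surnames = ["Smith", "Smith", "Smith", "Jones", "Jones", "Brown"]
stored_tokens = [token(s) for s in surnames]

# Client-side equality query: possible because the token is deterministic.
matches = [t for t in stored_tokens if t == token("Jones")]

# Observer on the server: no key, only token frequencies plus public
# knowledge of which surnames are expected to be most common.
public_ranking = ["Smith", "Jones", "Brown"]
observed_ranking = [t for t, _ in Counter(stored_tokens).most_common()]
guesses = dict(zip(observed_ranking, public_ranking))
```

In this toy data set every token is mapped back to the right surname simply by lining up the two rankings. Real data is noisier, but the leak is the same in kind.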
 

While it's obvious that this introduces some limitations on search
capabilities on the fields of the value, you might also have similar
problems just on the keys. For example you might not be able to use
any form of affinity which takes advantage of some domain specific
knowledge, or do just about anything useful beyond the pure
"key/value" capabilities which are extremely limited.
Besides, even the fact that the "key" doesn't change over time might
be critical: it means you can't use salting on the key, which again
introduces dictionary attacks by merely observing the frequency of
operations.

Even if you're prepared to give up on all those features and accept
some limitations to just encrypt it all on the client, the "grid"
needs nevertheless to be considered a trusted party; given the large
amount of data and access patterns, the data grid has so much insight
on both data and access patterns, that I doubt it can be properly
secured.

Granted. If a potential attacker had access to the machine hosting an Infinispan Server (e.g. could do a memory snapshot), the encryption algorithm would need to "survive" statistical analysis.
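
The point about not being able to salt the key can be sketched as follows. This is a rough illustration, not a proposal for the Hot Rod client: a plain Python dict stands in for the remote cache, and the secret, the key name and both helper functions are hypothetical. A salted key changes on every call, so the lookup misses; a deterministic key can be found again, but it is also the same byte string on every operation, which is what lets the server count accesses per key.

```python
import hashlib
import hmac
import os

# Hypothetical secret held only by the client (assumption for this sketch).
SECRET = b"client-side-secret"


def deterministic_key(key: str) -> bytes:
    """Same input always yields the same opaque key."""
    return hmac.new(SECRET, key.encode("utf-8"), hashlib.sha256).digest()


def salted_key(key: str) -> bytes:
    """A fresh random salt per call: the opaque key differs every time."""
    salt = os.urandom(16)
    mac = hmac.new(SECRET, salt + key.encode("utf-8"), hashlib.sha256).digest()
    return salt + mac


grid = {}  # stand-in for the remote cache; it only sees opaque byte strings

# Salted keys: the entry is stored, but a later lookup cannot find it.
grid[salted_key("account:42")] = b"ciphertext"
found_salted = grid.get(salted_key("account:42"))

# Deterministic keys: lookup works, at the price of a stable, countable key.
grid[deterministic_key("account:42")] = b"ciphertext"
found_deterministic = grid.get(deterministic_key("account:42"))
```

Here `found_salted` is `None` while `found_deterministic` returns the stored value, which is the trade-off described above.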
 

I'm not sure we have the right engineering skills to develop such a
system; we'd need at least to brush up on existing research in this
field, and I'm not aware of any "full solution" there unless
you give a good amount of trust to the database.

There's a database called CryptDB: http://bristolcrypto.blogspot.com/2013/11/how-to-search-on-encrypted-data-in.html

I haven't looked into the research papers yet, but if we had to trust any database, we should pick something like that.
 

I'd love it if someone could explore this more, but be aware that it's
not as easy as just enabling encryption on the client.

I totally agree. Thanks a lot for pointing all those useful aspects!
 

Thanks,
Sanne




On 25 November 2016 at 12:32, Sebastian Laskawiec <slaskawi@redhat.com> wrote:
> Hey!
>
> A while ago I stumbled upon [1]. The article talks about encrypting data
> before it reaches the server, so that the server doesn't know how to decrypt
> it. This makes the data more secure.
>
> The idea is definitely not new and I have been asked about something similar
> several times during local JUG meetups (in my area there are lots of
> payment organizations that might be interested in this).
>
> Of course, this can be easily done inside an app, so that it encrypts the
> data and passes a byte array to the Hot Rod Client. I'm just thinking about
> making it a bit easier and adding a default encryption/decryption mechanism
> to the Hot Rod client.
>
> What do you think? Does it make sense?
>
> Thanks
> Sebastian
>
> [1] https://eprint.iacr.org/2016/920.pdf
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev