How does an LSTM network know when to forget?

How does an LSTM network know when it is a good time to forget the dependencies it has learned?
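For context, a standard LSTM does not make a discrete "forget now" decision; a forget gate outputs a value between 0 and 1 for every element of the cell state at every time step, and the weights that produce those values are learned during training like any other parameter. A minimal NumPy sketch of that gate (dimensions and weights here are purely illustrative, not from any particular implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative dimensions: hidden size 4, input size 3.
hidden_size, input_size = 4, 3
rng = np.random.default_rng(0)

W_f = rng.normal(size=(hidden_size, hidden_size + input_size))  # learned forget-gate weights
b_f = np.zeros(hidden_size)                                     # learned forget-gate bias

h_prev = np.zeros(hidden_size)      # previous hidden state
c_prev = np.ones(hidden_size)       # previous cell state
x_t = rng.normal(size=input_size)   # current input

# Forget gate: one value in (0, 1) per cell-state element.
f_t = sigmoid(W_f @ np.concatenate([h_prev, x_t]) + b_f)

# Element-wise scaling of the cell state: values near 0 erase ("forget"),
# values near 1 keep. The input-gate term that would add new content is omitted here.
c_t = f_t * c_prev
print(f_t.round(3), c_t.round(3))
```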


Making an energy recovery ventilator - practical to make it Matter enabled?

I posted something much like this on the openHAB and Home Assistant forums too; I will decide what to do based on what I hear...
I am trying to produce an open source Energy Recovery Ventilator, and software is not my forte.
I frankly find the sheer variety and quantity of buzzwords and subsystems in the home automation sphere difficult to navigate. I am unclear on why exactly things have to be so complicated... anyway.
I am using a Raspberry Pi Pico running MicroPython. Do you think it would be practical to make it appear to a Matter hub as basically a fan with several different modes? Maybe report back some info so the user can see some status updates, etc.?
What I want is basically to allow it to be controlled by a hub, which may be running on a phone or someone's PC, so the hub's user interface can be used to turn the device on and off, move it up and down on a schedule, and connect it to other devices like a CO2 detector, smart switch, etc.
Sooner or later I will need, possibly with the help of module(s) running on the Pico that cache data (like schedule data), get the time, and so on, a dictionary which the rest of the system will use as its interface. The main loop consults the dictionary to determine behaviour at any given moment. The hub checks what time of day it is, etc., and sends that info along.
Is this sort of thing doable?
I tried to look into making the thing Alexa compatible and, ye gads, it would take me months to get that stuff working. They make everything so complicated.
I found some stuff for ESP32 devices like ESPHome, but it is not practical to use as a module in a larger system. MQTT looks like it could play an important role, but it doesn't quite get me there, and for some reason Alexa, Google Home, etc. still can't really talk to MQTT devices very well, especially when it comes to device setup. Basically, envision a little hardware device that just serves up some fields and takes back some fields, then appears as a device in Google Home's app, etc. I need that, but as a software module that runs on a Pico. Is it practical to roll this myself, or is it going to be an ungainly undertaking?
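To make the dictionary-as-interface idea concrete, here is a rough (Micro)Python sketch. Everything in it is illustrative: the field names, the stubbed hub transport, and the fan control would all be replaced by the real Matter/MQTT/HTTP bridge and hardware code.

```python
import time

# Shared state dictionary: the only interface between the hub-facing
# code and the rest of the system. Field names are illustrative.
state = {
    "power": True,        # on/off requested by the hub
    "mode": "auto",       # e.g. "auto", "low", "high", "boost"
    "fan_speed": 1,       # 0-3, set by a schedule or a CO2 sensor via the hub
    "status": "idle",     # reported back to the hub
}

def apply_hub_update(update):
    """Stub for whatever transport talks to the hub (MQTT callback,
    HTTP handler, Matter bridge, ...). It only writes known fields."""
    for key in ("power", "mode", "fan_speed"):
        if key in update:
            state[key] = update[key]

def set_fan(speed):
    # Placeholder for the real PWM / relay control on the Pico.
    print("fan speed ->", speed)

def main_loop():
    while True:
        if not state["power"]:
            set_fan(0)
            state["status"] = "off"
        else:
            set_fan(state["fan_speed"])
            state["status"] = "running mode=%s" % state["mode"]
        time.sleep(1)
```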

"Standard" approach to collecting data from/distributing data to multiple devices/servers?

I'll start with the scenario I am most interested in:
We have multiple devices (2-10) which all need to know about a growing set of data (thousands to hundreds of thousands of small chunks, say 100-1000 bytes each). Data can be generated on any device, and we want every device to be able to get all the data (edit: ...eventually; devices are not connected and/or online all the time, but they synchronize now and then). No data needs to be deleted or modified.
There are of course a few naive approaches to handle this, but I think they all have some major drawbacks. Naively sending everything I have to everyone else will lead to poor performance, with lots of old data being sent again and again. Sending an inventory first and then letting other devices request what they are missing won't do much good for small data. So maybe having each device remember when and whom they talked to could be a worthwhile tradeoff? As long as the number of partners is relatively small, saving the date of our last sync does not use that much space, and it should be easy to just send what has been added since then. But that's all just conjecture.
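A rough sketch of that last idea: each device keeps a small summary of what it has already seen and only ships the difference. Per-origin sequence counters are used here instead of wall-clock sync dates, to sidestep clock skew; all names and structures are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Chunk:
    origin: str      # device that created the chunk
    seq: int         # per-origin, monotonically increasing counter
    payload: bytes

@dataclass
class Device:
    name: str
    chunks: List[Chunk] = field(default_factory=list)
    # Highest sequence number seen from each origin, including ourselves.
    # This doubles as the "what have I already exchanged with you" memory.
    seen: Dict[str, int] = field(default_factory=dict)

    def add_local(self, payload: bytes):
        seq = self.seen.get(self.name, 0) + 1
        self.seen[self.name] = seq
        self.chunks.append(Chunk(self.name, seq, payload))

    def missing_for(self, peer_seen: Dict[str, int]) -> List[Chunk]:
        """Everything the peer has not seen yet, based on its compact summary."""
        return [c for c in self.chunks if c.seq > peer_seen.get(c.origin, 0)]

    def receive(self, chunks: List[Chunk]):
        for c in chunks:
            if c.seq > self.seen.get(c.origin, 0):
                self.chunks.append(c)
                self.seen[c.origin] = c.seq

def sync(a: Device, b: Device):
    # Exchange compact summaries first, then only the missing chunks.
    for_b = a.missing_for(b.seen)
    for_a = b.missing_for(a.seen)
    b.receive(for_b)
    a.receive(for_a)

a, b = Device("A"), Device("B")
a.add_local(b"hello from A")
b.add_local(b"hello from B")
sync(a, b)
print(len(a.chunks), len(b.chunks))   # both now hold 2 chunks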
This could be a very broad topic, and I am also interested in the problem as a whole: (decentralized) version control probably does something similar to what I want, as does a piece of software syncing photos from a user's smartphone, tablet and camera to online storage, and so on. Somehow they're all different, though, and there are many factors to keep in mind, such as data size, bandwidth, consistency requirements, processing power, or how many devices have aggregated new data between syncs. So what is the theory about this? Where do I have to look to find papers and such about what works and what doesn't, or is each case so different from all the others that there are no good all-round solutions?
Clarification: I'm not looking for ready-made software solutions/products. It's more like the question of which search algorithm to use to find paths in a graph. Computer science books will probably tell you it depends on the features of the graph (directed? weighted? hypergraph? Euclidean?) or on whether you will eventually need every possible path or just a few. There are different algorithms for whatever you need. I also considered posting this question on https://cs.stackexchange.com/.
In your situation, I would investigate a messaging service that implements the AMQP standard, such as RabbitMQ or OpenAMQ. Each time a new chunk is emitted, it should be sent to the AMQP broker, which will broadcast it to every device's queue. The messages can then either be pushed to the consumers or pulled from the queues.
You can also consider Kafka for data streaming from several producers to several consumers. Another possibility is ZeroMQ. It depends on your specific needs.
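A minimal sketch of that broker fan-out using the pika client, assuming a RabbitMQ broker on localhost; the exchange and queue names are illustrative.

```python
import pika

# Assumes a RabbitMQ broker on localhost; exchange/queue names are illustrative.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# A fanout exchange copies every published message to every bound queue.
channel.exchange_declare(exchange="chunks", exchange_type="fanout")

# One durable, named queue per device, so messages wait while the device is offline.
channel.queue_declare(queue="device-42", durable=True)
channel.queue_bind(exchange="chunks", queue="device-42")

# Producer side: publish each new chunk as it is created.
channel.basic_publish(exchange="chunks", routing_key="", body=b"new data chunk")

# Consumer side (per device): pull whatever has accumulated in its queue.
method, properties, body = channel.basic_get(queue="device-42", auto_ack=True)
print(body)

connection.close()
```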
Have you considered using Amazon Simple Notification Service (SNS) to solve this problem?
You can create a topic for each group of devices you want to keep in sync. Whenever there is an update to the dataset, the device can publish to the topic, and the update will in turn be pushed to all devices via SNS.
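If you go the SNS route, publishing on every dataset change is only a few lines with boto3. The topic ARN below is a placeholder, and AWS credentials/region are assumed to be configured already.

```python
import boto3

# Assumes AWS credentials and region are configured; the ARN is a placeholder.
sns = boto3.client("sns")

topic_arn = "arn:aws:sns:us-east-1:123456789012:device-group-sync"

# Publish a notification whenever the dataset changes; subscribed devices
# (or their push endpoints) receive it and fetch the new chunk.
sns.publish(
    TopicArn=topic_arn,
    Message='{"new_chunk_id": "abc123"}',
    Subject="dataset-updated",
)
```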

How to design a secure & lightweight protocol between the server and the mobile app?

I'm in a dilemma while designing a communication protocol between the server and the iOS/Android clients.
I'd like the protocol to be:
secure: since we will be sending important messages
lightweight: to keep the iOS/Android clients responsive for a better user experience
I've even drafted a solution with some encryption/encoding/compression algorithms over the HTTP REST interfaces. In brief, use the DH algorithm or RSA to secure the transmission of a symmetric key for later encryption of the messages in the session, a bit like the official iOS example CryptoExercise.
However, when I searched for similar questions on Stack Overflow, I found almost all the suggestions were 'use HTTPS' rather than 're-invent the wheel yourself'.
I agree that it's not wise to re-invent the wheel, but I'm not sure whether HTTPS can work well in the poor network conditions typically faced by mobile apps.
HTTPS will perform a key exchange every time a connection is established. Is it really suitable for an HTTP REST API with quick calls and small payloads?
Any other drawbacks?
Thanks in advance for your replies.
...or RSA to secure the transmission of a symmetric key
You should probably stick with cipher suites that provide forward secrecy. It's standard practice to use ephemeral key exchanges like DHE and ECDHE. And I believe the TLS Working Group is deprecating RSA key transport in TLS 1.3. See TLS: Clarifications and questions: TLS1.3 - Static RSA and AEAD from the TLS WG.
lightweight: to keep the iOS/Android clients responsive for a better user experience
The pain point is key exchange, and you probably can't avoid it.
For bulk encryption, perhaps you should use the new ChaCha/Poly1305 cipher suites, like TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305, TLS_DHE_RSA_WITH_CHACHA20_POLY1305 and TLS_RSA_WITH_CHACHA20_POLY1305. They are 4x or so faster. See Speeding up and strengthening HTTPS connections for Chrome on Android.
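If your client and server stacks go through OpenSSL, restricting the cipher list is typically a one-liner. A rough sketch with Python's ssl module; whether the ChaCha20-Poly1305 suites are available depends on the OpenSSL build, and TLS 1.3 suites are negotiated separately.

```python
import ssl

# Prefer ephemeral key exchange (forward secrecy) plus fast AEAD ciphers.
# Availability of ChaCha20-Poly1305 depends on the underlying OpenSSL build;
# TLS 1.3 cipher suites are not affected by set_ciphers().
context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
context.set_ciphers("ECDHE+CHACHA20:ECDHE+AESGCM:DHE+CHACHA20:DHE+AESGCM")

print(sorted(c["name"] for c in context.get_ciphers()))
```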
I've even drafted a solution with some encryption/encoding/compression algorithms over the HTTP REST interfaces
WebCrypto is months away from standardizing something. After that, the browsers have to adopt it. So you probably won't have primitives for months to years.
For the custom DH, you will lack the BigIntegers. Even when WebCrypto publishes its first standard, it will not include BigIntegers. See WebCrypto: Question on BigInteger operations. (By the way, I have a custom DH scheme for multifactor authentication; it's why I needed those BigIntegers, too.)
I've even drafted a solution with some encryption/encoding/compression algorithms
I hope you realize compression leaks information in the higher layers. See, for example, Rizzo and Duong's CRIME Attack on HTTP and SPDY protocols.
However, when I searched for similar questions on Stack Overflow, I found almost all the suggestions were 'use HTTPS' rather than 're-invent the wheel yourself'.
You should probably be on Google Scholar looking at articles on mTCP and mUDP, mobile VPN, mobile TLS, wireless TLS, etc.
Encryption is not going to be your problem. Quick recovery is going to be your problem.
I agree that it's not wise to re-invent the wheel, but I'm not sure whether HTTPS can work well in the poor network conditions...
Well, it's like Dr. Jon Bentley said: if it doesn't have to be correct, I can make it as fast as you'd like it to be. You should probably stick with SSL/TLS, harden it where deficient, and then speed it up with your choice of cipher suites.
whether HTTPS can work well in the poor network conditions typically faced by mobile apps
Mobile is not that bad in practice. There's nothing you can do about dropping a connection. There's nothing you can do if your IP is not transferred when roaming. And there's nothing you can do about the device's ConnectionManager claiming the write succeeded even when you know you have no coverage.
About all you can do is ensure fast recovery.
Maybe you should look into UDP and DTLS.
Is it [HTTPS] really suitable for an HTTP REST API with quick calls and small payloads?
See Bentley's quote.
And don't use that damn browser security model. It's a curse. (Unless, of course, Diginotar- and Trustwave-like failures are acceptable in your models.)
I made a test to compare HTTP and HTTPS performance under 'typical' network conditions:
Use a 3G network with an Android phone
Test at home, then on the way to the office: walking, on the train, and on the bus
In each round of tests, alternately fetch a 700-byte JSON string over HTTP and HTTPS
TEST RESULT:
HTTP calls: 2138 successes, 13 failures, average fetch time: 492 milliseconds
HTTPS calls: 1957 successes, 5 failures, average fetch time: 778 milliseconds
The test results show that, for fetching a small payload:
1. HTTPS is as stable as HTTP
2. The main drawback is latency: about 58% slower than HTTP
I think that's acceptable, so I'm considering switching to HTTPS.
After all, though it's not difficult to invent a wheel, it's not easy to invent one that is good enough, as #jww's comments point out.
I will run another test to see how HTTPS performs with a larger payload.
Thank you all.
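For anyone who wants to repeat a comparison like this, here is a rough sketch of the measurement loop. The URLs are placeholders, and a fresh session is created per fetch so every request pays the full connection (and TLS handshake) cost, as on a flaky mobile link; the numbers above are from the original test, not from this script.

```python
import time
import requests

# Placeholder endpoints serving the same ~700-byte JSON payload.
URLS = {"http": "http://example.com/payload.json",
        "https": "https://example.com/payload.json"}

results = {name: [] for name in URLS}
failures = {name: 0 for name in URLS}

for _ in range(100):                      # the original test ran ~2000 rounds
    for name, url in URLS.items():
        start = time.monotonic()
        try:
            # New session per request so no connection is reused between fetches.
            with requests.Session() as s:
                s.get(url, timeout=10).raise_for_status()
            results[name].append((time.monotonic() - start) * 1000)
        except requests.RequestException:
            failures[name] += 1

for name in URLS:
    ok = results[name]
    avg = sum(ok) / len(ok) if ok else float("nan")
    print(f"{name}: {len(ok)} ok, {failures[name]} failed, avg {avg:.0f} ms")
```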

Protecting crypto keys in RAM?

Is there any way to protect encryption keys stored in RAM from a freezer attack? (Sticking the computer in a freezer before rebooting into malicious code that reads the contents of RAM, i.e. a cold boot attack.)
This seems to be a legitimate issue with security in my application.
EDIT: it's also worth mentioning that I will probably be making a proof of concept OS to do this on the bare metal, so keep in mind that the fewer dependencies, the better. However, TRESOR does sound really interesting, and I might port the source code of that to my proof of concept OS if it looks manageable, but I'm open to other solutions (even ones with heavy dependencies).
You could use something like the TRESOR Linux kernel patch, which keeps the key only in the ring 0 (highest privilege level) CPU debug registers. Combined with an Intel CPU that supports the AES-NI instructions, this need not incur a performance penalty (despite the need for key recalculation) compared to a generic encryption implementation.
There is no programmatic way. You cannot stop an attacker from freezing your computer and removing the RAM chips for analysis.
If someone gains access to your hardware, everything you have on it is in the attacker's hands.
Always keep in mind:
http://cdn.howtogeek.com/wp-content/uploads/2013/03/xkcd-security.png
As Sergey points out, you cannot stop someone from attacking the RAM if the hardware is in their possession. The only possible way to defend the hardware is with a tamper-resistant hardware security module. There are a couple of varieties on the market: TPM chips and smart cards come to mind. Smart cards may work better for you because the user should remove them from the device when they walk away, and you can simply erase the keys when the card is removed.
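On the software side, about the best an ordinary application can do is keep the key in a mutable buffer and overwrite it the moment it is no longer needed (for example, when the card is pulled). A best-effort Python sketch follows, with the caveat that a garbage-collected runtime may have made copies you cannot reach, which is exactly the gap TRESOR-style approaches address.

```python
import hmac

def use_key(key: bytearray, message: bytes) -> bytes:
    """Use the key, then wipe the buffer. A bytearray (not bytes) is used so it
    can be overwritten in place; copies made by the runtime are out of our control."""
    try:
        return hmac.new(bytes(key), message, "sha256").digest()
    finally:
        for i in range(len(key)):
            key[i] = 0

key = bytearray(b"\x01" * 32)      # illustrative key material
tag = use_key(key, b"hello")
print(tag.hex(), key)              # the key buffer is now all zeros
```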
I would do a bit more risk analysis that would help you figure out how likely a frozen RAM attack is. Which computers are most at risk of being stolen? Laptops, servers, tablets, or smart phones? What value can your attackers possibly get from a stolen computer? Are you looking to keep them from decrypting an encrypted disk image? From recovering a document that's currently loaded in RAM? From recovering a key that would lead to decrypting an entire disk? From recovering credentials that would provide insider access to your network?
If the risks are really that high but you have a business need for remote access, consider keeping the secrets only on the secured corporate servers, and allowing only browser access to them. Use two factor authentication, such as a hardware access token. Perhaps you then require the remote machines to be booted only from read-only media and read-only bookmark lists to help ensure against viruses and other browser based attacks.
If you can put a monetary value on the risk, you should be able to justify the additional infrastructure needed to defend the data.

Decision making in distributed applications

With a distributed application, where you have lots of clients and one main server, should you:
Make the clients dumb and the server smart: clients are fast and non-invasive. Business rules are needed in only 1 place
Make the clients smart and the server dumb: take as much load as possible off of the server
Additional info:
Clients collect tons of data about the computer they are on. The server must analyze all of this info to determine the health of these computers
The owners of the client computers are temperamental and will shut down the clients if the client starts to consume too many resources (thus negating the purpose of the distributed app in helping diagnose problems)
You should do as much client-side processing as possible. This will enable your application to scale better than doing processing server-side. To solve your temperamental user problem, you could look into making your client processes run at a very low priority so there's no noticeable decrease in performance on the part of the user.
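On POSIX systems that can be as simple as raising the process's nice value; a minimal sketch (Windows would need a different mechanism, such as setting the process priority class):

```python
import os

# Lower this process's scheduling priority so the user's foreground work
# is not noticeably affected (POSIX only; a higher nice value means lower priority).
os.nice(19)
```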
In a client-server setting, if you care about security, you should always program on the assumption that the client may have been compromised. Even if it hasn't, there is always the risk of somebody using an old version of the client, using a competing or modified version of the client, or just of the net connection being a bit screwy.
So while you do as much work on the client as possible, processing and marshalling information into the right form, the server then needs to do a thorough sanity check on anything the client gives it.
So the answer I guess is "both".
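A rough sketch of what that server-side sanity check could look like for the health-report scenario; the field names and limits are purely illustrative.

```python
# Illustrative server-side validation of a client-submitted health report.
# Never trust fields the client computed; re-check types, ranges and sizes.

ALLOWED_METRICS = {"cpu_percent", "mem_percent", "disk_free_gb", "uptime_s"}

class InvalidReport(ValueError):
    pass

def validate_report(report: dict) -> dict:
    if not isinstance(report, dict) or len(report) > 100:
        raise InvalidReport("malformed report")

    clean = {}
    for name, value in report.items():
        if name not in ALLOWED_METRICS:
            continue                      # silently drop unknown fields
        if not isinstance(value, (int, float)):
            raise InvalidReport(f"{name} is not numeric")
        if name.endswith("_percent") and not 0 <= value <= 100:
            raise InvalidReport(f"{name} out of range")
        clean[name] = value
    return clean

print(validate_report({"cpu_percent": 42.5, "bogus": "x", "mem_percent": 17}))
```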
The server must analyze all of this info to determine the health of these computers
That is probably the biggest clue so far explaining what your application is about. Are you able to provide a more elaborate briefing on what this application is seeking to achieve in this distributed environment? We do not even know whether the client-side processing is disk I/O or processor intensive. How you design the solution depends on the nature of what needs to be done to help the users/business accomplish their jobs and objectives.
