New acoustic attack steals data from keystrokes with 95% accuracy

A team of researchers from British universities has trained a deep learning model that can steal data from keyboard keystrokes recorded using a microphone with an accuracy of 95%.

  • Coreidan@lemmy.world
    1 year ago

    I’ll believe it when it actually happens. Until then you can’t convince me that an algorithm can tell what letter was typed from hearing the action through a microphone.

    This sounds like absolute bullshit to me.

    The part that gets me is that the ONLY reason this works is because they first have to use a keylogger to capture the keystrokes of the target, then use that as an input to train the algorithm. If you switch out the target with someone else it no longer works.

    This process starts with using a keylogger. The fuck you need “ai” for if you have a keylogger?!? Lol.

    • Obsession@lemmy.world
      1 year ago

      That’s pretty much what the article says. The model needs to be trained on the target keyboard first, so you won’t just have people hacking you through a random Zoom call.

      • bdonvr@thelemmy.club
        1 year ago

        And if you have the access to train such a model, slipping a keylogger onto the machine would be so much easier

        • jumperalex@lemmy.world
          1 year ago

          Hmmm not totally. A bad actor could record the keyboard and then figure out a way to get it installed. Either through a logistics attack (not everyone maintains a secure supply chain), or an insider threat installing it. Everyone’s trained not to allow thumb drives and the like. But a 100% completely unaltered bog standard keyboard brought into a building is probably easier, and for sure less suspicious if you get caught.

          Sure you might say, “but if you have an insider you’ve already lost” to which I say, your insider is at risk if they do certain things. But once this keyboard is installed, their own detection risk is less.

          Now the question is, how far away can the mic be? Because that’s gonna be suspicious AF getting that installed. BUT!!! this is still a great way to break the air gap.

          • ItsMeSpez@lemmy.world
            1 year ago

            “A bad actor could record the keyboard and then figure out a way to get it installed”

            The room is important to the training of the model as well. So even if you know the make and model of the keyboard, the exact acoustic environment it is in will still require training data.

            Also if you can install a keyboard of your choosing, you can just put the keylogger inside the keyboard. If you’re actually getting your own peripherals installed on your target machine, training a model to acoustically compromise your target is the most difficult option available to you.

            • jumperalex@lemmy.world
              1 year ago

              Good point about the room.

              As for an installed keylogger, there are organizations that will inspect for that and catch it. My point is this is a way to get an actually unmolested USB device into play.

              But I hear you, this isn’t likely an ideal option right now, but it is an option for maybe some niche case. And these are early days; put enough funding behind it and it might become more viable. Or not. Mostly I’m just offering the thought that there ARE use cases if someone puts even a moment’s creative thought into tradecraft and the problems it might solve, like breaking the air gap, emplacement, avoiding detection, and data exfil. Each of those is a problem to be solved at various levels of difficulty depending on the exact target.

    • LouNeko@lemmy.world
      1 year ago

      I think you might have misunderstood the article. In one case they used the sound input from a Zoom meeting, and as a reference they used the chat messages from said Zoom meetings. No keyloggers required.

      I haven’t read the paper yet, but the article doesn’t go into detail about possible flaws. Like, how would the software differentiate between double assigned symbols on the numpad and the main rows? Does it use spell check to predict words that are not 100% conclusive? What about external keyboards? What if the distance to the microphone changes? What about backspace? People make a lot of mistakes while typing. How would the program determine if something was deleted if it doesn’t show up in the text? Etc.

      I have no doubt that under lab conditions a recognition rate of 93% is realistic, but I doubt that this is applicable in the real world. Nobody sits in a video conference quietly typing away at their keyboard. A single uttered word can throw off your whole training data. Most importantly, all video or audio call apps or programs have an activation threshold for the microphone enabled by default to save on bandwidth. Typing is mostly below that threshold. Any other means of collecting the data will require you to have access to the device to a point where installing a keylogger is easier.
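
      A rough sketch of the activation-threshold point, assuming a simple RMS noise gate. The frame size, threshold, and sample values are made up for illustration and are not taken from any real call app:

```python
import math

def rms(frame):
    """Root-mean-square amplitude of one frame of samples in [-1.0, 1.0]."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def noise_gate(samples, frame_size=4, threshold=0.1):
    """Silence every frame whose RMS is below the threshold (hypothetical gate)."""
    gated = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        if rms(frame) >= threshold:
            gated.extend(frame)                 # loud enough: transmitted as-is
        else:
            gated.extend([0.0] * len(frame))    # too quiet: replaced by silence
    return gated

# Illustrative values only: quiet key clicks followed by louder speech.
quiet_typing = [0.02, -0.03, 0.01, -0.02]
speech = [0.4, -0.5, 0.45, -0.35]
gated = noise_gate(quiet_typing + speech)
```

      With these numbers the typing frame is zeroed out and only the speech frame survives, which is the situation described above: keystrokes that stay under the gate never reach the listener.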

      • imaradio@lemmy.ca
        1 year ago

        It sounds like it would have to be a very targeted attack. Like if the CIA is after you this might be a concern.

            • LouNeko@lemmy.world
              1 year ago

              Good question. Since Zoom is mainly a business tool and a lot of high-profile companies rely on it, if there’s even the suspicion that Zoom uses collected data to steal passwords or company secrets, they will bring the hammer down in the most gruesome class action lawsuit. Companies pay good money for the business license and Zoom will certainly not bite the hand that feeds them.
              However, this might not apply to private Zoom users. And I’m certain that Zoom does some shady stuff behind the scenes with the data they collect on private individuals beyond simply “improving our services”.

    • Ironfist@sh.itjust.works
      1 year ago

      I’m skeptical too. It sounds very hard to do with the sound alone, but let’s assume that part works.

      The keylogger part could be done with a malicious website that activates the microphone and asks the user to input whatever. The site would know what you typed and how it sounded. Then that information could be used against you even when you are not in the malicious website.

      • Imgonnatrythis@lemmy.world
        1 year ago

        Hard to do, but with a very standard keyboard like a Mac keyboard, the resonance signatures should differ slightly based on location on the board. Take into account pattern recognition, relative pause length between keystrokes, and perhaps some forced training (i.e., get them to type known words like a name and address to feed the algorithm), and I think it’s potentially possible.
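
        As a minimal sketch of the pause-length feature mentioned above, assuming keystroke onset times (in seconds) have already been detected in the audio. The onset values and the 0.5 s pause cutoff are invented for illustration:

```python
def inter_key_intervals(onsets):
    """Gaps in seconds between consecutive keystroke onsets."""
    return [round(later - earlier, 3) for earlier, later in zip(onsets, onsets[1:])]

def long_pauses(intervals, cutoff=0.5):
    """Indices of gaps longer than the cutoff (possible word or thought breaks)."""
    return [i for i, gap in enumerate(intervals) if gap > cutoff]

# Hypothetical onset times for four keystrokes.
onsets = [0.00, 0.12, 0.31, 0.95]
intervals = inter_key_intervals(onsets)
pauses = long_pauses(intervals)
```

        Timing alone does not identify a key, but it could be one more input alongside the per-key sound.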

    • barryamelton@lemmy.ml
      1 year ago

      It doesn’t need a keylogger. It just needs a video call meeting, a Discord call where you type while on a public call, a recording of you on YouTube streaming and demoing something… etc.

    • HankMardukas@lemmy.world
      1 year ago

      It’s bad now, but look at where we’re at with AI… It’s like complaining that MS Paint in 1992 couldn’t make photorealistic fake images. This will only get better, never worse. Improvements will come quickly.

    • egeres@lemmy.world
      1 year ago

      This is gonna sound crazy, but I think you can skip the keylogger step!

      You could make a “keystroke-sound-language-model” (so like a language model that combines various modalities, e.g., Flamingo), then train that with self-supervised learning to match “audio” with “text”, and have a system where:

      • You listen to your target for a day or so, let’s say, 1000 words typed in 🤷🏻‍♂️
      • Then the model could do something akin to anchor tokens in language-to-language translation, except in this case it would be more like fixing on easy words such as “the” to give away part of the sound-to-key map. Then keep running this, mapping more parts of the keyboard
      • Eventually you try to extract passwords from your recordings and maybe bingo

      I think it’s very narrow to think that, just because this research case requires a keylogger, these systems couldn’t evolve over time to combine other techniques
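
      The steps above can be sketched very roughly as a substitution-cipher style solver. This is a toy under heavy assumptions, not the paper's method: it assumes keystroke sounds have already been clustered into integer IDs (one ID per physical key), that the most common three-key word is “the”, and that a small known vocabulary is available. All names and data here are hypothetical:

```python
from collections import Counter

def consistent(word, candidate, mapping):
    """True if `candidate` could spell `word` without contradicting `mapping`."""
    if len(word) != len(candidate):
        return False
    trial = dict(mapping)
    used = set(mapping.values())
    for cluster, letter in zip(word, candidate):
        if cluster in trial:
            if trial[cluster] != letter:
                return False      # cluster already bound to another letter
        elif letter in used:
            return False          # letter already bound to another cluster
        else:
            trial[cluster] = letter
            used.add(letter)
    return True

def build_mapping(words, vocabulary):
    """words: one tuple of sound-cluster IDs per typed word."""
    mapping = {}
    # Anchor step: assume the most common 3-key word with distinct keys is "the".
    three_key = Counter(w for w in words if len(w) == 3 and len(set(w)) == 3)
    if three_key:
        anchor = three_key.most_common(1)[0][0]
        mapping.update(zip(anchor, "the"))
    # Propagation: accept a word only when exactly one vocabulary entry fits.
    changed = True
    while changed:
        changed = False
        for word in sorted(set(words)):
            if all(cluster in mapping for cluster in word):
                continue
            fits = [v for v in vocabulary if consistent(word, v, mapping)]
            if len(fits) == 1:
                for cluster, letter in zip(word, fits[0]):
                    if cluster not in mapping:
                        mapping[cluster] = letter
                        changed = True
    return mapping

# Hypothetical recording: "the the then hat the" as cluster IDs.
typed = [(0, 1, 2), (0, 1, 2), (0, 1, 2, 3), (1, 4, 0), (0, 1, 2)]
vocabulary = ["the", "then", "hat", "that", "and"]
mapping = build_mapping(typed, vocabulary)
secret = "".join(mapping[c] for c in (4, 3, 0))   # a later, unseen key sequence
```

      A real system would need probabilistic matching, since clustering is noisy and real vocabularies leave many words ambiguous, but the anchoring idea is the same.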

  • abraham_linksys@sh.itjust.works
    1 year ago

    It looks like they only tested one keyboard from a MacBook. I’d be curious if other keyboard styles are as susceptible to the attack. It also doesn’t say how many people’s typing they listened to. I know mine changes depending on my mood or excitement about something, and I’m sure that would affect it.

  • the_beber@lemm.ee
    1 year ago

    Tangentially related: did you know that it’s technically also possible to reconstruct sound via smartphone accelerometers, and there are no restrictions on which apps can use them? Have fun with this info (:

    • Tangent5280@lemmy.world
      1 year ago

      Reconstruct sound using smartphone accelerometers? What do you mean? That accelerometers can act as speakers and produce sound? Or that they can act as microphones and record sound as numerical data of vibrations etc.? Can you point me to any articles or sources?

    • Aopen@discuss.tchncs.de
      1 year ago

      SpyApp is spying in background

      User thinks “why is battery draining so fast?”

      Opens battery setting

      Oh, this app shouldn’t work right now

      Restricts SpyApp’s battery permissions

    • Ironfist@sh.itjust.works
      1 year ago

      Are you saying that a cellphone accelerometer can be used as a microphone? That sounds… interesting. Do you have a source?

      • Croquette@sh.itjust.works
        1 year ago

        I am not the person you are replying to, but if the accelerometers are sensitive enough, the vibration of the voice will be picked up by the accelerometer.

        Since the sounds we make when talking are periodic, it is probably easier to track that periodicity and reconstruct the sound from there.

        It’s all my (un)educated guess.
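
        One common way to track periodicity in a sampled signal is autocorrelation: slide the signal against itself and see which lag matches best. A minimal sketch on a synthetic 50 Hz tone, where the sample rate, tone, and lag range are all assumptions chosen for illustration (real accelerometer data would be far noisier):

```python
import math

def autocorrelation(signal, lag):
    """Sum of products of the signal with a copy of itself shifted by `lag`."""
    return sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))

def estimate_period(signal, min_lag, max_lag):
    """Lag (in samples) at which the signal best matches itself."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(signal, lag))

sample_rate_hz = 1000   # assumed sensor sampling rate
tone_hz = 50            # synthetic stand-in for a periodic vibration
signal = [math.sin(2 * math.pi * tone_hz * n / sample_rate_hz) for n in range(400)]

period_samples = estimate_period(signal, min_lag=5, max_lag=100)
estimated_hz = sample_rate_hz / period_samples
```

        Recovering the dominant period is only the first step; reconstructing intelligible speech from it is a much harder problem.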

    • Lojcs@lemm.ee
      1 year ago

      Iirc on newer versions of Android there are restrictions on polling rate of sensor data
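
      Why the polling rate matters: a signal sampled at a given rate can only represent frequencies up to half that rate (the Nyquist limit). A small sketch, where the 200 Hz cap is an assumption used for illustration and should be checked against the Android documentation:

```python
def nyquist_limit_hz(sampling_rate_hz):
    """Highest frequency representable without aliasing at this sampling rate."""
    return sampling_rate_hz / 2

assumed_cap_hz = 200    # assumed sensor rate cap, for illustration only
highest_capturable_hz = nyquist_limit_hz(assumed_cap_hz)
```

      Under that assumption nothing above 100 Hz can be captured directly, which would cut off most of what makes speech intelligible.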

  • chaorace@lemmy.sdf.org
    1 year ago

    laughs in custom multi-layer orthogonal layout with one-of-a-kind enclosure & artisan keycaps

    • malloc@lemmy.world
      1 year ago

      Only plebs type. I write all of my content in machine code with a custom compiler to translate it to QWERTY.

      NSA/CIA/DEA/Interpol/FBI still trying to decode my shitposts to this day

  • quadropiss@lemmy.world
    1 year ago

    You have to train it on a per-device + per-room basis, and you don’t give everything access to your microphones.

    • Sloogs@lemmy.dbzer0.com
      1 year ago

      I was just thinking, streamers might have to be careful actually — you can often both see and hear when they’re typing, so if you correlated the two you could train a key audio → key press mapping model. And then if they type a password for something, even if it’s off-screen from their stream, the audio might clue you in on what they’re typing.
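
      A minimal sketch of that correlation step, assuming you already have keystroke onset times from the audio and timestamps for characters appearing on screen. The data and the 50 ms tolerance are hypothetical:

```python
def label_keystrokes(audio_onsets, visible_keys, tolerance=0.05):
    """Pair each audio onset with the on-screen key closest in time.

    audio_onsets: keystroke times in seconds, taken from the audio track.
    visible_keys: (time, character) pairs read off the video.
    Onsets with no visible key within `tolerance` seconds stay unlabelled.
    """
    labelled, unlabelled = [], []
    for onset in audio_onsets:
        time, key = min(visible_keys, key=lambda tk: abs(tk[0] - onset))
        if abs(time - onset) <= tolerance:
            labelled.append((onset, key))
        else:
            unlabelled.append(onset)
    return labelled, unlabelled

# Hypothetical stream: three visible keystrokes, then one typed off-screen.
audio_onsets = [1.00, 1.21, 1.43, 5.00]
visible_keys = [(1.02, "h"), (1.20, "i"), (1.45, "!")]
labelled, unlabelled = label_keystrokes(audio_onsets, visible_keys)
```

      The labelled pairs would be the training data; the unlabelled onsets are the off-screen typing a trained model would then try to decode.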

    • Botree@lemmy.world
      1 year ago

      Never knew my mutant blue switch keeb would come in handy one day. I’ve lubed the blue switches and added foam and tapes so now it sounds like a clicky-thocky blue-brown switches keeb.

  • randint@lemm.ee
    1 year ago

    Assuming that this does not only work on English words, this is actually really terrifying.

    • lagomorphlecture@lemm.ee
      1 year ago

      I have to assume it could be modified to work on any language. You just have to know the keyboard layout for the language in question so you know what to listen for. Languages with a lot of accents, like French, could maybe be slightly more complicated, but I seriously doubt that it couldn’t be done. I’m honestly not sure how the keyboard is set up for something like Chinese with so very many characters, but again, if this can be done, that can be done with some dedication and know-how.

      • randint@lemm.ee
        1 year ago

        There are several different ways of inputting Chinese, but generally they all map 2~6 keystrokes to one or multiple Chinese characters, and then the user chooses one. I’d imagine it wouldn’t be much harder.
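
        A toy sketch of that keystroke-to-character mapping, using pinyin as the example. The candidate table and its ordering are invented for illustration; real input methods rank candidates adaptively, so an eavesdropper would also need to know the ranking:

```python
# Hypothetical candidate lists: pinyin syllable -> characters offered to the user.
CANDIDATES = {
    "ni": ["你", "尼", "泥"],
    "hao": ["好", "号", "毫"],
}

def decode_keystrokes(keys):
    """Turn letters plus a digit selection (e.g. 'ni1') into chosen characters."""
    text = []
    syllable = ""
    for key in keys:
        if key.isdigit():
            # The digit picks the Nth candidate for the syllable typed so far.
            text.append(CANDIDATES[syllable][int(key) - 1])
            syllable = ""
        else:
            syllable += key
    return "".join(text)

recovered = decode_keystrokes("ni1hao1")
```

        So recovering the raw keys, including the selection key, would be enough to recover the text as long as the candidate order is known.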

  • mudcrip@lemm.ee
    1 year ago

    I find this article kinda mid bc:

    • No link to og paper
    • Article doesn’t specify what kinds of keystrokes were being detected (so title seems kind of clickbait); probably not all kinds of keyboards if they only trained the model on MacBooks?
    • Also no mention of the kind of data used to demonstrate 95% accuracy

  • Buddahriffic@lemmy.world
    1 year ago

    When your ADHD fidgeting and a mic attached to your head become a super power. No one can read my keystrokes!

  • 𝔼𝕩𝕦𝕤𝕚𝕒@lemmy.world
    1 year ago

    Sweet! More man-made horrors beyond my comprehension! I sure am glad we’re investing our time into things that will never be stolen or misused!

  • Hal-5700X@lemmy.world
    1 year ago

    Will a faraday bag help with a phone? Seeing how it blocks connections. You can unplug desktop mics.