Ring Doorbell Security

The Ring Doorbell has been invaluable as we travel the world. People's reactions are often pretty funny when the doorbell they just pressed starts talking to them and asking them to do something in our absence. It is usable even over our very low-bandwidth WiMAX link. The most annoying part of the device, until now, has been that our dog Bentley goes crazy when it rings our multitude of devices. Even when we are abroad, if he hears the phone notification he goes ballistic, instinctively knowing someone is in his yard… from a few hundred or thousand miles away.

I started getting some odd alarms on my internal network Snort sensor indicating that my doorbell was attempting to use Base64 encoding for basic auth over port 80. Thinking it might be just my doorbell, I reached out to support to ask:

[Screenshot: my exchange with Ring support]

Okay, so with the answer from support in hand, I needed to look into things more. I started by performing a packet capture between the device and my firewall. I was able to capture the port 80 communication path without much issue. It appears that events (motion, button press, etc.) are communicated to home base over port 80 via a JSON blob. The JSON structure is at the end of this post.

We can see from the capture that the ID value stays consistent for the duration of a session and acts as the session control mechanism the SIP server uses to identify the doorbell being rung. I logged into their website to see if the same JSON ID followed me there, but the ID seems to be temporary and possibly issued by the authentication server rather than by the button press itself. The SIP from address is always the MAC address of the device followed by @ring.com. Having now captured multiple events, I can safely say that the JSON ID changes between sessions.
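To make the addressing concrete, here is a minimal Python sketch that pulls the MAC address out of a `sip_from` value and the session identifier out of a `sip_to` value, using the values from the JSON blob at the end of this post. The function names are my own, and the assumption that the from address is always twelve hex digits followed by `@ring.com` rests only on the handful of events I captured.

```python
import re

def mac_from_sip_uri(sip_from):
    """Return the colon-separated MAC embedded in e.g. sip:001dc91e20a2@ring.com."""
    # Assumption: the user part is always 12 hex digits (the device MAC).
    match = re.fullmatch(r"sip:([0-9a-fA-F]{12})@ring\.com", sip_from)
    if match is None:
        raise ValueError("unexpected SIP from address: %r" % sip_from)
    raw = match.group(1).lower()
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))

def session_from_sip_to(sip_to):
    """Return the user part of a sip_to URI, which carries the session id."""
    if not sip_to.startswith("sip:"):
        raise ValueError("not a SIP URI: %r" % sip_to)
    user, _, _host = sip_to[len("sip:"):].partition("@")
    return user

print(mac_from_sip_uri("sip:001dc91e20a2@ring.com"))
print(session_from_sip_to("sip:665021697-1469363028@52.23.89.147"))
```

In the captured blob, the user part of `sip_to` is the same string as `sip_session_id`, which fits the idea that the session ID is what ties a button press to a call.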

After the press, the SIP server accepts the “push” as a call event and then sends out a group call to the various iPhones and iPads that are logged in, so one of them can start a call as needed. It is an interesting concept to use a SIP group to handle the communications path. Once the call is “answered”, no other device can join the session, so it is very much 1:1.

I find it interesting, as someone who does VoIP software, that they chose G.711 PCM as the audio format, alongside H.264 video over RTP with dynamic payload type 97. It seems there are much more resilient codecs they could have chosen for dealing with the latency/bandwidth constraints. Alas, I digress…
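For a sense of scale, here is a back-of-the-envelope Python sketch of what one direction of an RTP audio stream costs on the wire. G.711 runs at a fixed 64 kbit/s (8000 samples per second at 8 bits each). The 20 ms packetization interval and the 24 kbit/s Opus bitrate are assumptions for illustration, not values I measured from the doorbell; the 40 bytes of overhead are the IPv4 (20), UDP (8) and RTP (12) headers.

```python
def rtp_audio_kbps(codec_kbps, packet_ms=20, header_bytes=40):
    """Approximate on-the-wire bitrate (kbit/s) of one RTP audio stream.

    packet_ms is an assumed packetization interval; header_bytes covers
    IPv4 (20) + UDP (8) + RTP (12) and ignores link-layer framing.
    """
    packets_per_second = 1000 / packet_ms
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000

g711 = rtp_audio_kbps(64)   # G.711 PCM at its fixed 64 kbit/s
opus = rtp_audio_kbps(24)   # hypothetical Opus setting, chosen for illustration
print("G.711: %.1f kbit/s, Opus at 24 kbit/s: %.1f kbit/s" % (g711, opus))
```

Under those assumptions the G.711 stream needs roughly twice the bandwidth, and the gap widens at lower Opus bitrates, which matters a lot on a link like ours.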

Further down the SIP communication channel I finally found what was triggering my Snort alarms. There, in clear text, was the Basic Auth header carrying a Base64-encoded credential string.

Base64: MDAxZGM5MWUyMGEyOjZmMjhlMWM3ZGE0M2JjMTFmZjU1ZjBmMDU4MDM2NTU2AA==
Decoded: 001dc91e20a2:6f28e1c7da43bc11ff55f0f058036556

I recognised the first part of the decode as the MAC address of the device itself, 00:1d:c9:1e:20:a2. The remaining 32 hexadecimal characters look like an MD5 hash, as best I can tell. I ran the hash through my usual suspect sources for known matches in the basement, with no luck. My guess is that the MAC serves as the ID and the MD5 hash as the password. Why you would pass this over HTTP is beyond me, though.
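The decode above is easy to reproduce. Here is a minimal Python sketch using only the standard library; the function name is my own, and treating the second field as a password is my guess rather than anything confirmed. Note that the captured token decodes with a trailing NUL byte, which the sketch strips before splitting on the colon.

```python
import base64

# The Base64 token exactly as captured from the wire.
CAPTURED = "MDAxZGM5MWUyMGEyOjZmMjhlMWM3ZGE0M2JjMTFmZjU1ZjBmMDU4MDM2NTU2AA=="

def decode_basic_auth(token):
    """Decode a Basic Auth token into (user, secret)."""
    raw = base64.b64decode(token)
    # The captured token ends in a NUL byte; drop it before decoding.
    text = raw.rstrip(b"\x00").decode("ascii")
    user, _, secret = text.partition(":")
    return user, secret

user, secret = decode_basic_auth(CAPTURED)
print("user:  ", user)
print("secret:", secret, "(%d hex characters)" % len(secret))
```

Base64 is an encoding, not encryption, so anyone on the path between the doorbell and the server can do the same thing.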

I stopped short of actually probing the API endpoint with JSON. Without permission from the company and its security team, I didn’t think this prudent. I wanted to head off the question of why I didn’t dig deeper.

In the security world we always evaluate the vulnerability against the risk. Additionally, we focus on a defence-in-depth approach to make sure that multiple protections are in place rather than relying on one. For us, we have multiple rings of camera identification before you make it to our front door, with physical access identification at the road. Within our internal network I segment the Ring doorbell on a private VLAN to make sure any communication channel is limited to the device and its home servers rather than my greater network. This, plus 802.1X on our home network, ensures that even with the insecure authentication and settings traffic, you can do little harm to us.

For this case, in our configuration, the usability outweighs the risk. I would make a few recommendations to Ring though:

  1. Switch to something like Opus for your audio encoding. This would be better for users like myself who live on poor WAN links.
  2. Move your video streaming to VP9 or something more bandwidth efficient.
  3. Once you implement more efficient audio and video codecs, you should be able to migrate your SIP sessions to TLS without much issue.
  4. Secure your authentication and JSON configuration streams with HTTPS at the very least! COME ON!

Video of Bentley flipping out with the doorbell:

JSON blob:

JavaScript Object Notation: application/json
    Object
        Member Key: "motion"
            Object
                Member Key: "id"
                    String value: 640791565
                Member Key: "state"
                    String value: ringing
                Member Key: "motion_snooze"
                    Number value: 2
                Member Key: "sip_server_ip"
                    String value: 52.23.89.147
                Member Key: "sip_server_port"
                    String value: 15063
                Member Key: "sip_server_tls"
                    String value: false
                Member Key: "sip_session_id"
                    String value: 665021697-1469363028
                Member Key: "sip_server_tls_port"
                    String value: 15064
                Member Key: "sip_from"
                    String value: sip:001dc91e20a2@ring.com
                Member Key: "sip_to"
                    String value: sip:665021697-1469363028@52.23.89.147
                Member Key: "button_press_path"
                    String value: /doorbots_api/motions/640791565/button_pressed
                Member Key: "mic_volume"
                    Number value: 11
                Member Key: "voice_volume"
                    Number value: 11
                Member Key: "stream_profile"
                    Number value: 2
                Member Key: "udp_ping_server"
                    Null value
                Member Key: "udp_ping_port"
                    Null value
                Member Key: "enable_recording"
                    Number value: 1
        Member Key: "settings"
            Object
                Member Key: "utc_offset"
                    String value: -04:00
                Member Key: "keep_alive"
                    Number value: 15
                Member Key: "doorbell_volume"
                    Number value: 8
                Member Key: "enable_chime"
                    Number value: 1
                Member Key: "enable_vod"
                    Number value: 0
                Member Key: "exposure_control"
                    Number value: 2
                Member Key: "theft_alarm_enable"
                    Number value: 0
                Member Key: "pir_sensitivity_1"
                    Number value: 10
                Member Key: "pir_sensitivity_2"
                    Number value: 5
                Member Key: "pir_sensitivity_3"
                    Number value: 5
                Member Key: "pir_zone_enable"
                    Number value: 7
                Member Key: "use_cached_domain"
                    Number value: 0
                Member Key: "use_server_ip"
                    Number value: 0
                Member Key: "server_domain"
                    String value: fw.ring.com
                Member Key: "server_ip"
                    Null value
                Member Key: "enable_log"
                    Number value: 1
                Member Key: "keep_alive_ms"
                    Number value: 15000
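
For anyone who wants to check their own captures, here is a small Python sketch that reads a blob shaped like the one above and reports the SIP endpoint and whether TLS is switched on. The sample is a hand-rebuilt subset of the dissected fields, not the raw bytes from the wire, and the idea that the device picks `sip_server_tls_port` when `sip_server_tls` is true is my assumption from the field names.

```python
import json

# Subset of the fields shown in the dissection above, rebuilt by hand.
SAMPLE = json.loads("""
{
  "motion": {
    "sip_server_ip": "52.23.89.147",
    "sip_server_port": "15063",
    "sip_server_tls": "false",
    "sip_server_tls_port": "15064",
    "sip_from": "sip:001dc91e20a2@ring.com"
  },
  "settings": {
    "server_domain": "fw.ring.com",
    "keep_alive": 15,
    "keep_alive_ms": 15000
  }
}
""")

def summarize(blob):
    """Report the SIP endpoint in use and whether the blob asks for TLS."""
    motion = blob["motion"]
    # The flag arrives as the string "false", not a JSON boolean.
    tls = str(motion["sip_server_tls"]).lower() == "true"
    # Assumption: the TLS port is only used when the flag is true.
    port = motion["sip_server_tls_port"] if tls else motion["sip_server_port"]
    return {
        "sip_endpoint": "%s:%s" % (motion["sip_server_ip"], port),
        "sip_tls": tls,
        "server_domain": blob["settings"]["server_domain"],
    }

print(summarize(SAMPLE))
```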