Amazon EC2 metadata - Python library and CLI | TurnKey GNU/Linux
Blog Tags: aws, ec2, cloud
Alon Swartz
- Wed, 2010/03/24 - 12:35 -
6 comments
Each Amazon EC2 instance has associated metadata, as well as user data supplied when launching the instance. The meta and user data is instance-specific, and therefore only accessible to the instance.
The data is useful on several levels, such as configuring SSH public keys, programmatically configuring the instance according to certain criteria, or even executing user supplied initialization scripts.
Retrieving the data
Retrieving the data is done by querying an Amazon web server at the address 169.254.169.254, the same one used in the curl example and the library code below. The available API versions can be listed by performing a GET request on the base URI. The latest version of the API is always available under the latest path.
There is quite a lot of information available through the API, some of it more useful than the rest. For example: ami-id, ami-launch-index, availability-zone, instance-id, public-ipv4, user-data, ... (see below for the full list).
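As a rough illustration of the URL layout described above, here is a minimal Python 3 sketch. The helper names (metadata_url, fetch, api_versions) and the 2 second timeout are made up for this example, and it assumes the plain GET style of access described in this post. The fetch calls only work from inside an EC2 instance, so they are left commented out.

```python
import urllib.request

# Address of the metadata service (the same one the curl example uses).
BASE = "http://169.254.169.254"


def metadata_url(path, version="latest"):
    """Return the URL for a path such as 'meta-data/instance-id' or 'user-data'."""
    return "%s/%s/%s" % (BASE, version, path.lstrip("/"))


def fetch(url, timeout=2):
    """GET a URL and return the body as text (timeout is an arbitrary choice)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", "replace")


def api_versions():
    """List the API versions, which the base URI returns one per line."""
    return fetch(BASE + "/").split()


# On an EC2 instance you could then run, for example:
#     print(api_versions())
#     print(fetch(metadata_url("meta-data/instance-id")))
```

Pinning a dated version instead of latest (for example metadata_url("user-data", version="2008-02-01")) keeps a script stable if the layout of newer API versions changes.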
Some notes on user data
One of the most useful pieces of data is user-data, which can be used to pass configuration information or even initialization scripts to the instance upon launch.
User data must be base64 encoded and is limited to 16 KB (before encoding). The popular API tools usually handle the encoding transparently, so you shouldn't have to worry about it. The data is also decoded before being presented to the instance, so again, you shouldn't need to worry.
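If you ever do need to prepare user data by hand, the encoding and the size check might look something like the sketch below. The function name encode_user_data and the sample script are hypothetical; the only facts taken from the paragraph above are the base64 encoding and the 16 KB pre-encoding limit.

```python
import base64

# 16 KB limit, measured on the raw data before it is base64 encoded.
MAX_USER_DATA = 16 * 1024


def encode_user_data(raw):
    """Base64-encode user data (bytes), refusing anything over the limit."""
    if len(raw) > MAX_USER_DATA:
        raise ValueError(
            "user data is %d bytes, limit is %d" % (len(raw), MAX_USER_DATA)
        )
    return base64.b64encode(raw).decode("ascii")


# Hypothetical initialization script passed as user data.
script = b"#!/bin/sh\necho hello from user-data\n"
encoded = encode_user_data(script)

# Decoding gives back exactly what was passed in, which is what the
# instance sees when it queries user-data.
assert base64.b64decode(encoded) == script
```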
What you do need to worry about though, or at least be aware of, is security.
The user-data (and all metadata for that matter) can be accessed by any user or process on the instance. So please, please, do not specify any secret information in user-data unless you are absolutely sure what you are doing. Even then, I'd think twice.
"But," you say, "I trust all my users and processes." OK, how about this (thanks go to Eric Hammond for this example): you run a website that allows users to upload files by specifying a URL. A user specifies the URL of your instance's user-data on the metadata service, and lo and behold, your user-data and any secrets it includes have been divulged.
Do you still want to include secrets in user-data?
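If you do run a service that fetches user-supplied URLs, one partial defence is to refuse URLs that resolve to the metadata address or other internal ranges. The sketch below is only that, a sketch: the function names are made up, and it does not cover HTTP redirects or a hostname that resolves differently between the check and the actual fetch, so it is no substitute for keeping secrets out of user-data.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_forbidden_address(ip):
    """True for addresses a URL-fetching feature should never contact."""
    addr = ipaddress.ip_address(ip)
    # 169.254.169.254 is link-local; loopback and private ranges are
    # rejected as well since they are equally internal.
    return addr.is_link_local or addr.is_loopback or addr.is_private


def is_safe_url(url):
    """Resolve the URL's host and reject it if any address is forbidden."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    # info[4][0] is the resolved address; drop any IPv6 scope id ("%eth0").
    addresses = [info[4][0].split("%")[0] for info in infos]
    return not any(is_forbidden_address(a) for a in addresses)
```

With this in place, is_safe_url("http://169.254.169.254/latest/user-data") returns False, so the upload handler can reject the request before fetching anything.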
The simple way
The simplest way of retrieving metadata is by use of a command line network tool, such as curl, for example:
curl http://169.254.169.254/latest/meta-data/public-ipv4
The more programmatic way
Usually you need a more programmatic interface, and there are a couple of libraries available for different languages. I didn't find one that met my needs, so I wrote one in Python called ec2metadata.py. I licensed the copyright over to Canonical so it could be included in Ubuntu's ec2-init package.
ec2metadata.py has both a command-line interface and a Python interface:
$ ec2metadata.py                # all options will be displayed
$ ec2metadata.py --instance-id  # displays the instance id
import ec2metadata
instanceid = ec2metadata.get('instance-id')
print(instanceid)
It can be very useful when coupled with inithooks, for example to set the SSH public keys on first boot.
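To give an idea of what such a first-boot hook might do, here is a small Python 3 sketch. The function install_public_keys is hypothetical and not part of ec2metadata.py or inithooks; it only assumes that the keys arrive as a list of strings, which is what ec2metadata.get('public-keys') returns.

```python
import os


def install_public_keys(keys, path):
    """Append any keys not already present to an authorized_keys file.

    Returns the list of keys that were actually added.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as f:
            content = f.read()
    existing = {line.strip() for line in content.splitlines() if line.strip()}

    added = []
    for key in keys:
        key = key.strip()
        if key and key not in existing:
            existing.add(key)
            added.append(key)

    if added:
        # Create ~/.ssh style directory if needed, readable by owner only.
        os.makedirs(os.path.dirname(path) or ".", mode=0o700, exist_ok=True)
        with open(path, "a") as f:
            # Avoid gluing the first new key onto an unterminated last line.
            if content and not content.endswith("\n"):
                f.write("\n")
            for key in added:
                f.write(key + "\n")
        os.chmod(path, 0o600)
    return added


# In a first-boot hook (assuming ec2metadata.py is importable) this could be:
#     import ec2metadata
#     install_public_keys(ec2metadata.get('public-keys'),
#                         '/root/.ssh/authorized_keys')
```

Because keys that are already present are skipped, running the hook more than once does not duplicate entries.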
#!/usr/bin/env python3
#
# Query and display EC2 metadata related to the AMI instance
# Copyright (c) 2009 Canonical Ltd. (Canonical Contributor Agreement 2.5)
#
# Author: Alon Swartz
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""
Query and display EC2 metadata

If no options are provided, all options will be displayed

Options:
    -h --help               show this help

    --kernel-id             display the kernel id
    --ramdisk-id            display the ramdisk id
    --reservation-id        display the reservation id

    --ami-id                display the ami id
    --ami-launch-index      display the ami launch index
    --ami-manifest-path     display the ami manifest path
    --ancestor-ami-id       display the ami ancestor id
    --product-codes         display the ami associated product codes
    --availability-zone     display the ami placement zone

    --instance-id           display the instance id
    --instance-type         display the instance type

    --local-hostname        display the local hostname
    --public-hostname       display the public hostname

    --local-ipv4            display the local ipv4 ip address
    --public-ipv4           display the public ipv4 ip address

    --block-device-mapping  display the block device id
    --security-groups       display the security groups

    --public-keys           display the openssh public keys
    --user-data             display the user data (not actually metadata)
"""

import sys
import time
import getopt
import socket
import urllib.error
import urllib.request

METAOPTS = ['ami-id', 'ami-launch-index', 'ami-manifest-path',
            'ancestor-ami-id', 'availability-zone', 'block-device-mapping',
            'instance-id', 'instance-type', 'local-hostname', 'local-ipv4',
            'kernel-id', 'product-codes', 'public-hostname', 'public-ipv4',
            'public-keys', 'ramdisk-id', 'reservation-id', 'security-groups',
            'user-data']


class Error(Exception):
    pass


class EC2Metadata:
    """Class for querying metadata from EC2"""

    def __init__(self, addr='169.254.169.254', api='2008-02-01'):
        self.addr = addr
        self.api = api

        if not self._test_connectivity(self.addr, 80):
            raise Error("could not establish connection to: %s" % self.addr)

    @staticmethod
    def _test_connectivity(addr, port):
        # The service may not be reachable yet early in boot, so retry
        # up to 6 times, one second apart, before giving up.
        for _ in range(6):
            s = socket.socket()
            try:
                s.connect((addr, port))
                return True
            except socket.error:
                time.sleep(1)
            finally:
                s.close()

        return False

    def _get(self, uri):
        url = 'http://%s/%s/%s/' % (self.addr, self.api, uri)
        try:
            value = urllib.request.urlopen(url).read().decode('utf-8', 'replace')
        except urllib.error.HTTPError as e:
            # a missing key is reported as unavailable, not as an error
            if e.code == 404:
                return None
            raise

        if "404 - Not Found" in value:
            return None

        return value

    def get(self, metaopt):
        """return value of metaopt"""

        if metaopt not in METAOPTS:
            raise Error('unknown metaopt', metaopt, METAOPTS)

        # availability-zone lives under placement/ rather than at the top level
        if metaopt == 'availability-zone':
            return self._get('meta-data/placement/availability-zone')

        if metaopt == 'public-keys':
            # the listing has one "<id>=<name>" line per key; the key
            # itself is fetched from <id>/openssh-key
            data = self._get('meta-data/public-keys')
            if data is None:
                return None

            keyids = [line.split('=')[0] for line in data.splitlines()]

            public_keys = []
            for keyid in keyids:
                uri = 'meta-data/public-keys/%d/openssh-key' % int(keyid)
                public_keys.append(self._get(uri).rstrip())

            return public_keys

        # user-data sits next to meta-data, not inside it
        if metaopt == 'user-data':
            return self._get('user-data')

        return self._get('meta-data/' + metaopt)


def get(metaopt):
    """primitive: return value of metaopt"""

    m = EC2Metadata()
    return m.get(metaopt)


def display(metaopts, prefix=False):
    """primitive: display metaopts (list) values with optional prefix"""

    m = EC2Metadata()
    for metaopt in metaopts:
        value = m.get(metaopt)
        if not value:
            value = "unavailable"

        if prefix:
            print("%s: %s" % (metaopt, value))
        else:
            print(value)


def usage(s=None):
    """display usage and exit"""

    if s:
        print("Error:", s, file=sys.stderr)
    print("Syntax: %s [options]" % sys.argv[0], file=sys.stderr)
    print(__doc__, file=sys.stderr)
    sys.exit(1)


def main():
    """handle cli options"""

    try:
        getopt_metaopts = METAOPTS[:]
        getopt_metaopts.append('help')
        opts, args = getopt.gnu_getopt(sys.argv[1:], "h", getopt_metaopts)
    except getopt.GetoptError as e:
        usage(e)

    # no options given: show everything, labelled with the option name
    if len(opts) == 0:
        display(METAOPTS, prefix=True)
        return

    metaopts = []
    for opt, val in opts:
        if opt in ('-h', '--help'):
            usage()

        metaopts.append(opt.replace('--', ''))

    display(metaopts)


if __name__ == "__main__":
    main()
Comments
Hi - The boto library
Mitch Garnaat - Wed, 2010/03/24 - 04:04
Hi -
The boto library also provides a couple of methods to access instance metadata and userdata: boto.utils.get_instance_metadata and boto.utils.get_instance_userdata. I implement a retry mechanism because, as you note, sometimes the interface isn't available yet if you try to run this on startup.
Mitch
Boto is excellent
Alon Swartz
- Wed, 2010/03/24 - 04:35
Thanks for the link, Mitch. I've been using boto since 2007; keep up the great work.
With regards to the above code, we needed a simple way for instances to access metadata, both from the CLI (mostly for testing and debugging), and from other Python scripts. I could have hooked into boto.utils, but I wanted to keep it simple, with a clear-cut interface for other projects (e.g., ec2-init).
Regarding retries, that's the point of the _test_connectivity function.
user data security is not as bad as stated
Scott Moser - Thu, 2010/03/25 - 08:04
User data security is not as bad as stated above. It can be made to be as secure as a root owned file with 400 permissions on it by routing the data service off once you've collected the information you need. This can be done with:
route add -host 169.254.169.254 reject
Once that is done, no user space process can get at the service. In order to do so, it would have to have compromised root and run:
route del -host 169.254.169.254 reject
The Ubuntu lucid images can do this for you if you use 'cloud-config' syntax. A user-data with the following will have cloud-init route the service off for you early in boot.
#cloud-config
disable_ec2_metadata: true
For more information, see [1], or try a Lucid image [2].
Doing this obviously breaks things that depend on the instance data being there, but that can be overcome by caching the data to root-owned files. I'm not saying that user-data is an excellent way to store important credentials, but it isn't as bad as it is often made out to be.
--
[1]
[2]
Seems like a good idea
Alon Swartz
- Thu, 2010/03/25 - 09:20
Thanks for the link to cloud-init (I see ec2-init has changed its name).
I thought about blocking the metadata IP, and it seems like a good idea, but I'm just not sure how Amazon has set up its security (is it an actual machine? running on the host?). I wonder how easy it would be to bypass using some sort of MITM and IP spoofing.
Chances are that blocking the 169.254 IP for incoming and outgoing traffic would be sufficient, but, as you mentioned, user-data is not the ideal place to store secret information.
IAM Roles for EC2 Instances
Rob Oliver - Tue, 2012/12/11 - 12:20
I had been using this suggestion of blocking meta data as part of my user-data script, but it seems like you have to expose the meta data now if you want to use IAM Roles for EC2 Instances, which is now highly recommended by AWS (instead of embedding long-term persisted credentials on the instance).
One of the security recommendations AWS proposed during the re:Invent conference was to use a bastion host for all your EC2 instances, and log all activity.
Finally on github and in the package archive
Alon Swartz
- Mon, 2013/04/22 - 09:37
I finally got around to creating an ec2metadata package (uploaded to the TurnKey archive) and uploading it to github (only took me 3 years). While I was at it I did some refactoring and split the CLI and lib.