Puttering with a proposal for a web API for numerical data sharing


Published in: Python



"""Setting up a web service for sharing numerical data

This came up in the context of people doing quantified-self experiments
but I think it can be valuable in several contexts, if done well. General
principles:

(1) Data should be as self-documenting as possible, via semantic
    markup or something similar.
(2) Simultaneously, there should be as few a-priori constraints as
    possible on how data is formatted.
(3) Data should be human-readable, or easily rendered human-readable.
(4) There should be some provision for controlling access to data.

So I think it makes sense to offer data in a JSON format.

To support ease of understanding data semantically, data objects may
refer to a descriptor object. If they don't, they should contain their
own descriptive information.

To avoid confusion between different objects, every object with any
likelihood of persistence (the exception being a request template sent
to the web service) should include a randomly generated 128-bit UUID.

An alternative to the UUID would be a permanently-assigned URI, in
keeping with the Semantic Web way of doing things. It might make sense
that any object shared publicly would have a URI.

Privacy control can be done a few different ways. Obviously a server
could operate behind a firewall and the system could be set up to
prohibit data sharing beyond the firewall. But for data that can't
just live behind a firewall there should be some more nuanced
provision for controlling privacy.

A web service should allow users to:
(1) Post descriptor and data objects on the server.
(2) Specify who is allowed to fetch a particular object.
    (a) Create groups of the users on the system.
    (b) Use a set-union of groups and individual users to specify
        who is allowed to fetch an object. (See example below, where
        an individual is "Bob Smith" and a group is "weatherbuffs".)
(3) Fetch data and descriptor objects using a template.
"""

import json
import pprint
import datetime
import urllib.parse
import urllib.request

########### Some preliminaries ##########################
#
# As of this writing I don't yet have a server working.
# But the following shows how client code will access it
# when it exists.

SERVER_URL = 'http://127.0.0.1/data-server/'

def httpPostRequest(params):
    # POST the form-encoded params (urlopen needs bytes) and decode the JSON reply.
    body = urllib.parse.urlencode(params).encode('utf-8')
    with urllib.request.urlopen(SERVER_URL, body) as f:
        return json.loads(f.read().decode('utf-8'))

def httpGetRequest(params):
    # GET with the params in the query string and decode the JSON reply.
    url = SERVER_URL + '?' + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as f:
        return json.loads(f.read().decode('utf-8'))

HOUR = datetime.timedelta(hours=1)

########### Descriptors and data #####################

def generateId():
    # A random (version 4) 128-bit UUID, as described in the docstring.
    import uuid
    return uuid.uuid4().hex

descriptor = {
    'id': generateId(),    # a URI could also work here
    'summary': 'A sequence of temperature samples in time',
    'creator': 'Will Ware <[email protected]>',
    'format': {
        'units': 'fahrenheit',
        'period': 1,    # seconds
    },
}

now = datetime.datetime.now()    # how to handle time zones? UTC?

data = {
    'timestamp': now.strftime('%Y/%m/%d %H:%M:%S.%f'),
    'latitude': 42.0,
    'longitude': -71.0,
    'id': generateId(),
    'descriptor': descriptor['id'],
    'creator': 'Will Ware <[email protected]>',
    'experiment': "Outdoor temperature near Will's house",
    'summary': 'Anything summary-wise more specific than the descriptor',
    'samples': [
        82.0, 85.0, 83.0, 84.0, 86.0
    ],
    'visible-to': [
        'Bob Smith', 'weatherbuffs'
    ]
}

pprint.pprint(descriptor)
print()
pprint.pprint(data)

if False:
    # submit a descriptor object
    httpPostRequest(descriptor)

    # submit a data object
    httpPostRequest(data)


#################################################
# Here is a template for fetching all data objects less than two hours
# old, created by Will, and using the descriptor appearing above.

template = {
    'timestamp__gt':    # server will probably be written in Django
        (now - 2 * HOUR).strftime('%Y/%m/%d %H:%M:%S.%f'),
    'creator': 'Will Ware <[email protected]>',
    'descriptor': descriptor['id'],
}

print()
pprint.pprint(template)

if False:
    print(httpGetRequest(template))
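
The access rule in the docstring (item 2b, a set-union of groups and individual users) could be checked on the server roughly as follows. This is only a sketch under stated assumptions: the `GROUPS` table, its member names, and the `allowed_users` and `can_fetch` helpers are hypothetical, and treating an object without a 'visible-to' list as fetchable by nobody is one possible policy, not something the proposal specifies.

```python
# Hypothetical group table; a real server would keep this in its database.
GROUPS = {
    'weatherbuffs': {'Alice Jones', 'Carol Diaz'},   # made-up members
}

def allowed_users(visible_to, groups):
    """Expand a 'visible-to' list into the set of users who may fetch."""
    allowed = set()
    for name in visible_to:
        if name in groups:
            allowed |= groups[name]   # a group name: add all of its members
        else:
            allowed.add(name)         # otherwise treat it as a single user
    return allowed

def can_fetch(user, obj, groups=GROUPS):
    """Assumed policy: no 'visible-to' entry means nobody may fetch."""
    return user in allowed_users(obj.get('visible-to', []), groups)

example = {'visible-to': ['Bob Smith', 'weatherbuffs']}
print(can_fetch('Bob Smith', example))     # True: named individually
print(can_fetch('Alice Jones', example))   # True: member of weatherbuffs
print(can_fetch('Mallory', example))       # False: neither named nor in a group
```

One open design question this sketch exposes: if a user and a group share a name, the list entry is ambiguous, so a real service would probably want separate namespaces or typed entries.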

URL: http://edison.thinktrylearn.com/experiments/show/198
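
The fetch template in the snippet above borrows Django's `field__gt` lookup style. A rough, framework-free sketch of how a server might match stored objects against such a template is given here. The `LOOKUPS` table, the `matches` and `fetch` helpers, and the sample records are all assumptions for illustration; a Django-backed server would more likely hand the template to the ORM's `filter()`.

```python
import operator

# Lookup suffixes this sketch understands (an assumed subset of Django's).
LOOKUPS = {
    'gt': operator.gt,
    'gte': operator.ge,
    'lt': operator.lt,
    'lte': operator.le,
}

def matches(obj, template):
    """Return True if obj satisfies every key in the template."""
    for key, wanted in template.items():
        # 'timestamp__gt' -> ('timestamp', '__', 'gt'); 'creator' -> ('creator', '', '')
        field, sep, suffix = key.partition('__')
        compare = LOOKUPS[suffix] if sep else operator.eq
        if field not in obj or not compare(obj[field], wanted):
            return False
    return True

def fetch(objects, template):
    """Return the stored objects that match the template."""
    return [obj for obj in objects if matches(obj, template)]

# Made-up sample records. The zero-padded '%Y/%m/%d %H:%M:%S.%f' format
# sorts chronologically, so plain string comparison is enough here.
stored = [
    {'id': 'a', 'creator': 'Will', 'timestamp': '2011/06/01 09:00:00.000000'},
    {'id': 'b', 'creator': 'Will', 'timestamp': '2011/06/01 12:30:00.000000'},
    {'id': 'c', 'creator': 'Bob', 'timestamp': '2011/06/01 12:45:00.000000'},
]
template = {'creator': 'Will', 'timestamp__gt': '2011/06/01 11:00:00.000000'}

print([obj['id'] for obj in fetch(stored, template)])   # ['b']
```

An unknown suffix raises `KeyError` in this sketch; a real service would presumably reject such a template with a clear error instead.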
