OCF Big Data Symposium 2015

From the Wiki of Ilan Chamovitz - Informatics, Administration, Production Engineering, Education, Information Technology

Manchester Informatics & OCF Big Data Symposium: Showcasing the diversity of Big Data Analytics


When? 18th June 2015 Time? 09.30am-5pm

Where? Lecture Theatre 1.1, Kilburn Building, Oxford Road, University of Manchester


Manchester Informatics (University of Manchester) and OCF present a one-day Big Data Symposium to showcase the diversity of big data analytics in higher education, research and the public sector. Featuring presenters from Future Cities/Manchester City Council, SKA, ARM, Oracle and the University, the day aims to highlight the latest projects and themes currently being undertaken and discussed in this rapidly growing field.


http://www.datascience.manchester.ac.uk/


The University of Manchester Data Science Institute.

Manchester's Data Science Institute acts as an access point to the University’s expertise in data science, facilitates interactions between data science researchers and problem holders, owns the University’s data science strategy, and will deliver sustainable support for the community.

Manchester has an engaged data science community of almost 250 investigators, with methodologists embedded in Schools across the University addressing problems in extracting meaning from data, managing data volume, the variety of data used in analyses, the velocity with which it is produced and the veracity of those data.

Data science has a home in all four of the University's faculties (Engineering and Physical Sciences, Humanities, Life Sciences and Medical and Human Sciences), supported across the whole data life cycle, from information management through analytics to practical applications, by work in the Schools of Computer Science and Mathematics. This creates a virtuous circle, where challenging real-world problems drive the methodology research agenda whilst providing a natural driver for building new algorithms and methods.


   Speakers


As keynote speaker, Professor Magnus Rattray will present 'Biology in the era of big data.'
Speakers will also include: Steve Turner (Manchester City Council), Miles Deegan (SKA), Tim Harris (Oracle), John Goodacre (Arm Ltd), Michael Gleaves (Hartree), Mark Elliot (University of Manchester) and Chris Brown (OCF).

Please visit the Manchester Informatics events page for further information on the presentations.

   Agenda 


Please see below for a list of speakers and abstracts.
 Time 	Presentation
9.30am – 10am 	Registration and coffee
10am 	Welcome – Robert Stevens, Professor of Computer Science, School of Computer Science, University of Manchester
10.15am 	Keynote Presentation
Prof Magnus Rattray, Professor of Computational and Systems Biology, University of Manchester – “Biology in the era of Big Data”
11am – 12pm 	Session chair, Nick Dingle – OCF   

Mark Elliot, Senior Lecturer in Social Statistics, University of Manchester – “Big Data, Privacy and all that stuff”
Miles Deegan, Engineering Project Manager, SKA – “21st Century Radio Astronomy and its data challenges – a progress report on the SKA”
Norman Paton, Professor in School of Computer Science, University of Manchester – “Data Wrangling: The Elephant in the Room of Big Data”
12pm – 1.30pm 	Lunch, networking and posters
1.30pm – 2.30pm 	Session chair, Stephen Checkley – AstraZeneca

John Goodacre, Director of Technology and Systems, Arm Ltd – “EUROSERVER: Riding the perfect storm”
Michael Gleaves, Head of Business Development, Hartree – “Hartree Centre – Data one of the four forces of change”
Chris Brown, OCF – “The Connected World”
2.30pm 	Coffee
3pm – 4pm 	Session chair, Robin Pinning – University of Manchester  

Sarah Bridle, Professor of Astrophysics, University of Manchester – “Big Data in Astronomy: The Large Synoptic Survey Telescope”
Steve Turner, Manchester City Council – “Data in the Urban Environment”
Tim Harris, Architect, Oracle Labs – “Can Big Data be In-Memory”
 4pm 	Panel session 

Panel speakers:
David Topping – University of Manchester
Magnus Rattray – University of Manchester
John Goodacre – Arm Ltd
Mark Elliot – University of Manchester
Adrian Slatcher – Manchester City Council
4.30pm 	Wrap up
4.35pm 	Drinks, networking and posters

Keynote Speaker

Prof Magnus Rattray, Professor of Computational and Systems Biology, University of Manchester

Title – Biology in the era of big data

Biology has become a data-rich science and the way that we do biology is changing to adapt. This change is being driven by the arrival of large datasets from new technologies for DNA sequencing, high-resolution imaging and microfluidics. These technologies allow high-throughput biological investigations at the level of individual cells and across large numbers of cells, with the potential to profile every cell in a biological sample. Such single-cell technologies are transforming our understanding of biology and medicine in many areas, e.g. we can now characterise the mutations of the individual cells forming a tumour, or quantify the expression of genes within every immune cell in a blood sample. I will describe how these technological advances are being matched by advances in data analysis, and review some of the new modelling, visualisation and inference algorithms being developed to make progress in this data-intensive era.

Speakers

Mark Elliot, Senior Lecturer in Social Statistics, University of Manchester

Title – Big Data, Privacy and all that stuff

This presentation will consider the role of privacy in the era of big data. Famous quotes such as “You have zero privacy anyway. Get over it.” by Scott McNealy and “Privacy is dead” by Michael Hyatt suggest that we should stop fretting about privacy and get on and enjoy the undoubted fruits of the digital era. However, these views are based on a fundamental misunderstanding about what privacy is and the role it plays in defining society. Here I discuss the meaning of privacy contextualised by “the data environment”. Our argument is that privacy is and always has been so contextualised, but that in the age of ubiquitous data, far from privacy being dead, we have a historic opportunity to re-define a more proactively constructed form of privacy. Until now privacy has co-evolved with society; now we have the opportunity to design the type of privacy we want. Decisions about privacy have fundamental relationships with companion concepts: autonomy, identity and democracy. So decisions about privacy are no less than decisions about what sort of society we want.

Miles Deegan, Engineering Project Manager, SKA

Title – 21st Century Radio Astronomy and its data challenges – a progress report on the SKA

The Square Kilometre Array (SKA) Project is nearly two years into its pre-construction phase, with around 500 scientists and engineers around the world contributing to the design of its constituent elements. One of these elements, the Science Data Processor, is concerned with tackling the e-infrastructure challenges the SKA poses and devising the high performance computing and big data capabilities that will be required to turn exabytes of raw data into astronomical discoveries. This talk will discuss these challenges and outline potential solutions.

Norman Paton, Professor in School of Computer Science, University of Manchester

Title – Data Wrangling: The Elephant in the Room of Big Data

The challenges that face data management in the current era of increasingly rapid and widespread data production and publication are referred to as the four V’s of big data, namely Volume – the scale of the data, Velocity – speed of change, Variety – different forms of data, and Veracity – uncertainty of data. These highlight that although size matters in relation to big data, it isn't everything, and hint that there may be significant work to be done identifying the data that is relevant or suitable to a problem, and preparing the data for use. Data Wrangling is the process whereby data is extracted, selected, cleaned and combined to support application-specific analyses. This presentation reflects on the challenges that the four V's of big data bring to data wrangling, and on the advances that may be required to ensure that the costs of data wrangling do not become a significant impediment to the use of big data.
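The four wrangling steps named in the abstract (extract, select, clean, combine) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two inlined CSV sources, their column names and the helper functions are hypothetical and are not taken from the talk.

```python
import csv
import io

# Two hypothetical sources, inlined as CSV text so the sketch is self-contained.
SENSORS_CSV = """sensor_id,city,status
s1, Manchester ,active
s2,manchester,retired
s3,Leeds,active
"""

READINGS_CSV = """sensor_id,value
s1,12.5
s1,
s2,9.0
s3,not-a-number
s3,7.25
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def select(rows, predicate):
    """Select: keep only the rows relevant to the analysis."""
    return [row for row in rows if predicate(row)]


def clean_sensor(row):
    """Clean: trim whitespace and normalise the case of text fields."""
    return {
        "sensor_id": row["sensor_id"].strip(),
        "city": row["city"].strip().title(),
        "status": row["status"].strip().lower(),
    }


def clean_readings(rows):
    """Clean: drop readings whose value is missing or not numeric."""
    cleaned = []
    for row in rows:
        try:
            value = float(row["value"])
        except (TypeError, ValueError):
            continue  # skip unusable readings
        cleaned.append({"sensor_id": row["sensor_id"].strip(), "value": value})
    return cleaned


def combine(sensors, readings):
    """Combine: join readings to sensors on sensor_id."""
    by_id = {s["sensor_id"]: s for s in sensors}
    return [
        {**by_id[r["sensor_id"]], "value": r["value"]}
        for r in readings
        if r["sensor_id"] in by_id
    ]


def wrangle():
    """Run the four steps end to end on the hypothetical sources."""
    sensors = [clean_sensor(r) for r in extract(SENSORS_CSV)]
    sensors = select(sensors, lambda s: s["status"] == "active")
    readings = clean_readings(extract(READINGS_CSV))
    return combine(sensors, readings)


if __name__ == "__main__":
    for row in wrangle():
        print(row)
```

Even in this toy case most of the code deals with inconsistent formatting and unusable values rather than with analysis, which is the cost the abstract warns about.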

John Goodacre, Director of Technology and Systems, Arm Ltd

Title – EUROSERVER: Riding the perfect storm

Leveraging the processing capability of the latest ARM 64-bit mobile processors, EUROSERVER, a European Commission FP7-funded project, is combining the technology trends of nanotechnology 3D integration, low-power SoC processor integration and the impossible requirements from next-generation compute to investigate and build a solution for a scalable, cost-effective and flexible ARM-based server system architecture suitable across multiple markets. This talk will provide an overview of the ARM technology and introduce the vision and the goals for the project and the approach the consortium is taking to realize a ground-breaking solution out of this perfect storm.

Michael Gleaves, Head of Business Development, Hartree

Title – Hartree Centre – Data one of the four forces of change

The Hartree Centre is focusing on delivering economic benefit through industrial engagement. Our research agenda includes the application of High Performance Computing, Big Data and Visualisation capabilities to commercially relevant problems. I will present case studies of some recent projects in which data analytical techniques used within the Hartree Centre addressed real-world problems in the construction, life sciences and resource planning sectors.

Chris Brown, OCF

Title – The Connected World

The connected world is described as the emergence of countless objects, animals and even people with uniquely identifiable, embedded devices that are wirelessly connected to the internet. These ‘nodes’ can send or receive information without the need for human intervention. There are estimates that there will be 50 billion connected devices by 2020, and all of these connections will be transmitting data. The data may be transitional or it may be persistent. This session will look at ways of capturing that persistent data and then at some sample use cases of how we can benefit from that plethora of information.

Sarah Bridle, University of Manchester

Title – Big Data in Astronomy: The Large Synoptic Survey Telescope

Recent technological advances have made it possible to carry out deep optical surveys of a large fraction of the visible sky. These surveys enable a diverse array of astronomical and fundamental physics investigations including: the search for small moving objects in the solar system, studies of the assembly history of the Milky Way, the exploration of the transient sky, and the establishment of tight constraints on models of dark energy using a variety of independent techniques. The Large Synoptic Survey Telescope (LSST) brings together astrophysicists, particle physicists and computer scientists in the most ambitious project of this kind that has yet been proposed. With an 8.4 m primary mirror, and a 3.2 Gigapixel, 10 square degree CCD camera, LSST will provide nearly an order of magnitude improvement in survey speed over all existing optical surveys, and those which are currently in development. Currently being constructed, and due to enter commissioning in 2020, in its first month of operation LSST will survey more of the universe than all previous telescopes built by mankind. Over the full ten years of operation, it will survey half of the sky in six optical colors, discovering four billion new galaxies and 10 million supernovae. At least 800 distinct images will be taken of every field, enabling a plethora of statistical investigations for intrinsic variability and for control of systematic uncertainties in deep imaging studies. LSST will produce 15 TB of data per night, yielding a data set of over 100 PB over ten years. Dedicated Computing Facilities will process the image data in near real time, and issue worldwide alerts within 60 seconds for objects that change in position or brightness. In this talk I will describe some of the challenges presented by LSST data analysis.
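The data volumes quoted in the abstract can be related with a little arithmetic. The sketch below takes the nightly rate and the survey length from the abstract; the number of observing nights per year and the use of decimal units (1 PB = 1000 TB) are assumptions made only for illustration.

```python
TB_PER_NIGHT = 15        # from the abstract
SURVEY_YEARS = 10        # from the abstract
NIGHTS_PER_YEAR = 365    # assumption: upper bound, observing every single night
TB_PER_PB = 1000         # assumption: decimal units


def raw_volume_pb(tb_per_night, nights_per_year, years):
    """Total nightly data accumulated over the survey, in petabytes."""
    return tb_per_night * nights_per_year * years / TB_PER_PB


if __name__ == "__main__":
    total = raw_volume_pb(TB_PER_NIGHT, NIGHTS_PER_YEAR, SURVEY_YEARS)
    print(f"Nightly data over the survey: about {total:.2f} PB")
```

Under these assumptions the nightly stream alone adds up to roughly 55 PB, so the quoted total of over 100 PB presumably counts more than the nightly output, for example derived data products. The abstract does not say, so that reading is an assumption.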

Tim Harris, Architect, Oracle Labs

Title – Can Big Data be In-Memory

The monikers of “big data” and “in-memory” are certainly hyped in the database world, but some people might argue that they don’t overlap. The terms are vague enough to be treated as mutually exclusive as well as mostly overlapping. In addition, “big data” often implies “data science” which means “not SQL” (based on some programmable framework like Map-Reduce, Spark, or Flink). How do we see the “in-memory” and “big data” trends for analytics evolving in the future (separately or together) and what is the role of SQL vs. other frameworks?

Steve Turner, Manchester City Council

Title – Data in the Urban Environment

Increasingly, data will be used by cities to make more informed decisions to improve their relative competitiveness. The choice of infrastructure and the types of services offered are examples of where data science will be an increasingly important discipline. There are challenges, however, not least around governance, ownership and quality.

 Notes and links taken during the event


9.30am – 10am
Registration and coffee
10am
Welcome
10.15am
Keynote Presentation
Prof Magnus Rattray, Professor of Computational and Systems Biology, University of Manchester “Biology in the era of Big Data”
  High-throughput technologies, Single-cell and Genomic Medicine 
  Roadmap Epigenomics Project - http://www.roadmapepigenomics.org/
  Single-cell data - each cell is now a high-dimensional data point
  Clustering single cell protein data - Amir et al., Nature Biotech 2013,
     Stegle et al. 2014 -  http://www.nature.com/nrg/journal/v16/n3/abs/nrg3833.html
Genomic Medicine 
 http://www.genomemedicine.com/
 http://www.mangen.co.uk/index.php
 http://www.fmlmconference.co.uk/sessions/genomics-%E2%80%93-changing-face-clinical-care 
 Genomics – changing the face of clinical care Prof. Sue Hill.
 http://www.stfc.ac.uk/about-us/our-impacts-achievements/case-studies/advancing-disease-mapping-techniques-with-gsk/


 11am – 12pm
Mark Elliot, Senior Lecturer in Social Statistics, University of Manchester “Big Data, Privacy and all that stuff”
  Privacy is dead? "Your entire life is online"
 Hyatt (2012)
 McNealy (1999)
   http://postscapes.com/internet-of-things-examples/
 http://www.democrata.co.uk/
 

 Miles Deegan, Engineering Project Manager, SKA - “21st Century Radio Astronomy and its data challenges – a progress report on the SKA”
  https://www.skatelescope.org/science/
 
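 Norman Paton, Professor in School of Computer Science, University of Manchester – “Data Wrangling: The Elephant in the Room of Big Data”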
 Talend - https://www.talend.com/
 VADA - Value Added Data Systems http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/M025268/1
 Data Wrangling - https://en.m.wikipedia.org/wiki/Data_wrangling


12pm – 1.30pm
Lunch, networking and posters
1.30pm – 2.30pm
 John Goodacre, Director of Technology and Systems, Arm Ltd “EUROSERVER: Riding the perfect storm”
 ARM http://www.arm.com/
 Sandia Report - http://www.sandia.gov/news/publications/annual_report/
 E-infrastructure strategy 
 https://www.gov.uk/government/publications/e-infrastructure-strategy-roadmap-for-development-of-advanced-computing-data-and-networks


 Michael Gleaves, Head of Business Development, Hartree “Hartree Centre – Data one of the four forces of change”
 http://www.stfc.ac.uk/innovation/ways-to-work-with-us/the-hartree-centre/hartree-centre-case-studies/
 http://www.stfc.ac.uk/innovation/ways-to-work-with-us/the-hartree-centre/


 Chris Brown, OCF – “The Connected World”

2.30pm

Coffee
3pm – 4pm
Tim Harris, Software Developer – Architect, Oracle “Can Big Data be In-Memory”
 http://www.dataversity.net/spiderbooks-spidergraph-linking-datasets-help-sell-better/
 http://docs.oracle.com/cd/E56133_01/about/future.html 
 https://en.m.wikipedia.org/wiki/Domain-specific_language 
Steve Turner, Manchester City Council “Data in the Urban Environment”
 Manchester Corridor - http://www.corridormanchester.com/welcome 
 Manchester Infrastructure map - http://www.placenorthwest.co.uk/news/may-launch-date-for-manchester-infrastructure-map/
 Manchester smart city - http://macf.ontheplatform.org.uk/article/manchester-smart-city
 
Sarah Bridle, LSST 
 https://en.m.wikipedia.org/wiki/Large_Synoptic_Survey_Telescope
  Accelerating Universe  -  https://en.m.wikipedia.org/wiki/Accelerating_universe
  https://en.m.wikipedia.org/wiki/Abell_2218
  http://lsst.org/lsst/science/scientist_cosmic_shear
  http://www.lsst.org/lsst/scibook


TBC
4pm
Panel session
4.30pm
Wrap up
4.35pm
Drinks, networking and posters.