Thursday, February 19, 2009

Random Useful References

Some references that might interest people taking this course.

- NetworkX.

A Python package for modelling graphs and networks:

http://networkx.lanl.gov/


- Graphviz.

A graph visualization tool (works with NetworkX):

http://www.graphviz.org/
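
To give a feel for what Graphviz takes as input, here is a minimal sketch that writes a small undirected graph in the DOT language using only the Python standard library. The helper name `to_dot` and the example edges are made up for this post, and the helper assumes node names contain no double quotes. The output could be saved to a file and rendered with something like `dot -Tpng graph.dot -o graph.png`.

```python
def to_dot(edges, name="G"):
    """Return DOT source for an undirected graph given as (u, v) pairs.

    Hypothetical helper for illustration only; assumes node names are
    plain strings with no double quotes in them.
    """
    lines = ["graph %s {" % name]
    for u, v in edges:
        # In DOT, "--" denotes an undirected edge.
        lines.append('    "%s" -- "%s";' % (u, v))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # A triangle on three nodes.
    print(to_dot([("a", "b"), ("b", "c"), ("c", "a")]))
```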


- A line-by-line Python profiler

A Python profiler that gives you performance information about each line of your code, so you can attack the main bottlenecks first.

http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/line_profiler/summary
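
A minimal sketch of how the profiler is typically used, assuming the script is run through the `kernprof` script that ships with line_profiler (for example `kernprof -l -v slow.py`; older releases may name the script `kernprof.py`). When run that way, a `profile` decorator is made available as a builtin. The no-op fallback below is my own addition so the file also runs under plain Python, and the file name `slow.py` and the function `sum_of_squares` are made up for illustration.

```python
# slow.py (hypothetical file name)
try:
    profile  # made available as a builtin when run under kernprof -l
except NameError:
    def profile(func):
        # No-op fallback so the script also runs without the profiler.
        return func


@profile
def sum_of_squares(n):
    # Deliberately written as an explicit loop so that there are
    # several lines for the profiler to report on.
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    print(sum_of_squares(100000))
```

Only functions marked with `@profile` get per-line timings, so it probably makes sense to decorate just the few functions you suspect are slow.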


- "The Visual Display of Quantitative Information", a classic book by Edward Tufte.

http://www.edwardtufte.com/tufte/books_vdqi


- The NEOS server and AMPL

A server that accepts optimization jobs over the web. High-end hardware and high-quality solvers for free! Accepts jobs in the AMPL language.

http://www-neos.mcs.anl.gov/
http://www.ampl.com

- The Stony Brook Algorithm Repository

Search for algorithms by problem or by language. The website also provides a ranking of the algorithms (presumably which ones are "best"). For example, I needed an algorithm for matching in bipartite graphs. I just searched under "Graph problems - Polynomial type problems" -> "Matching" and found a list of them, with their ranking.

http://www.cs.sunysb.edu/~algorith/
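
To make the bipartite matching example concrete, here is a minimal sketch of the simple augmenting-path method for maximum bipartite matching. This is not necessarily one of the implementations ranked in the repository, just the textbook approach in a few lines. The function name `max_bipartite_matching` and the input format (a dict from each left vertex to the right vertices it may be paired with) are my own choices for illustration.

```python
def max_bipartite_matching(adj):
    """Maximum matching in a bipartite graph via augmenting paths.

    adj maps each left vertex to a list of right vertices it may be
    matched with. Returns a dict mapping matched right vertices to
    their left partners. Illustrative sketch; recursion depth grows
    with the graph, so this is meant for small inputs only.
    """
    match_of_right = {}

    def try_augment(u, visited):
        # Try to find a partner for left vertex u, re-matching earlier
        # left vertices along the way if that frees up a right vertex.
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in match_of_right or try_augment(match_of_right[v], visited):
                match_of_right[v] = u
                return True
        return False

    for u in adj:
        try_augment(u, set())
    return match_of_right


if __name__ == "__main__":
    jobs = {"a": ["x", "y"], "b": ["x"], "c": ["y", "z"]}
    matching = max_bipartite_matching(jobs)
    print(len(matching), "pairs matched")
```

Each call to `try_augment` either increases the matching by one pair or leaves it unchanged, so the running time is roughly the number of left vertices times the number of edges. The faster algorithms listed in the repository are the better choice for large graphs.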


- "Large Scale Data Analysis Challenges" (talk at Caltech next week)

Tuesday, February 24th
12:00 - 1:00pm
74 Jorgensen

*Lunch will be provided*

SPEAKER:
Dan Meiron
Fletcher Jones Professor of Applied & Computational Mathematics and
Computer Science

TITLE:
Large Scale Data Analysis Challenges

ABSTRACT:
JASON, a scientific advisory group, was asked by representatives of the
Department of Defense (DOD) and the Intelligence Community (IC) to
recommend ways in which the DOD/IC can handle present and future
sensor data in fundamentally different ways, taking into account both the
state-of-the-art, the potential for advances in areas such as data structures,
the shaping of sensor data for exploitation, as well as methodologies for
data discovery.

In this presentation we will examine the challenges associated with the
analysis of large data and in particular compare DOD/IC requirements to
those of several data intensive fields such as high energy physics and
astronomy. The conclusion is that while DOD/IC data requirements are
certainly significant, they are not unmanageable given the capabilities
of current and projected storage technology.

The key challenge will be to adequately empower DOD and IC analysts by
matching analysis needs to data delivery modalities. At a very cursory level,
we will examine some current approaches that could enable better information
fusion. We'll also propose various grand challenges that could be used to
assess and prioritize future research efforts in data assimilation and fusion.
