CeON Blog

04.02.2012 – 15:42

Apache Hadoop at CEON, ICM UW

Currently, this post is available here only in Polish. It will be translated as soon as possible.

By Adam Kawa | Posted in Uncategorized | Tagged apache, bwmeta, ceon, giraph, hadoop, hbase, hive | Comments (1)

03.28.2012 – 13:26

(Polish) Opening of the Centre for Open Science Blog

By wpmaster | Posted in News | Comments (0)

01.11.2012 – 01:30

Assorted curiosities: Geography

Fun facts learned while clicking through Wikipedia:

Treasure Island in Ontario, Canada is probably the largest island in a lake in an island in a lake.
Liechtenstein and Uzbekistan are doubly landlocked countries, i.e., all of their neighbouring countries are landlocked as well.
The Republic of Kalmykia is the only predominantly Buddhist region of Europe.
The Door to Hell is a 70 m (230 ft) wide hole near the village of Derweze in Turkmenistan, filled with natural gas, which has been burning since 1971.

The Door to Hell near Derweze, Turkmenistan

By Łukasz Bolikowski | Posted in curiosities, wikipedia | Comments Off

04.04.2013 – 16:34

Assorted links

Some assorted links collected this week:

A new, interesting-looking book, “Web Social Science” by Robert Ackland, is coming out in July 2013.
In a recent issue of Nature (March 28): a special on the future of scientific publishing.
An interesting TED talk by Colin Camerer on neuroscience and experimental economics.
A nice paper analyzing world email traffic, co-authored by Michael Macy. Another example of using the ‘igraph’ package for network analysis.
Gary King and Stuart Shieber on Open Access science and publishing.

There are discussions in various places about the merits, pitfalls, and misunderstandings related to buzzwords such as “bigdata” and “data science” (what a useless term it is…), and to analysis being “data-driven” or “evidence-based”. Perhaps I will write a separate post on that at some point… For now:

“Let the Data Speak for themselves”, a guest post by Joseph Rickert on Revolutions blog
Echoes of and comments on Nate Silver’s acclaimed book “The Signal and the Noise”, for example:
- Matt Asay on readwrite (hat tip to Dominik Batorski), and here
- at NYT
David Brooks at NYT
Petr Keil on data-driven science

By Michał | Posted in ceon | Tagged ceon | Comments Off

06.19.2012 – 23:52

Correction to intergraph update

It turned out that I wrote the last post on the “intergraph” package too hastily. After some feedback from the CRAN maintainers and some deliberation, I decided to release the updated version of the “intergraph” package under the original name (so no new package “intergraph0”) with version number 1.2. This version relies on the legacy “igraph” version 0.5, which is now called “igraph0”. Package “intergraph” 1.2 is now available on CRAN.

Meanwhile, I’m working on a new version of “intergraph”, scheduled to be ver. 1.3, which will rely on the new version 0.6 of “igraph”.

I am sorry for the mess.

By Michał | Posted in ceon | Tagged ceon | Comments Off

11.02.2010 – 14:28

intergraph+network: no hacking necessary

A short update on the network+intergraph R packages story:

A couple of days ago Carter Butts released a new version of the ‘network’ package (ver. 1.5-1). It has a namespace now. Consequently, the ‘intergraph’ package should work out of the box. There is no need to install my hacked version of the ‘network’ package anymore.

By Michał | Posted in ceon, network | Tagged ceon | Comments Off

03.08.2011 – 20:23

Math in the social sciences, with discussion

A nice discussion on the usefulness, or lack thereof, of mathematics and formal theory building in the social sciences. Make sure you have a look at the comments. More or less chronologically:

sociology needs more… @ orgtheory.net by Fabio Rojas
math and sociology @ orgtheory.net by Fabio Rojas
methodological convergence in the social sciences @ Marc F. Bellemare

With some appraisal here, here, and to some extent here.

Somewhat in parallel, a discussion about the death of theoretical (read mathematical) economics at econlog:

The decline of economic theory @ econlog by Bryan Caplan
Response on orgtheory.net

All in all, I subscribe to Fabio’s call with both hands.

My subjective list of advantages of formal theory building in the social sciences, supplementing the one at orgtheory.net:

If a theory is, among other things, a logically coherent set of propositions, then formalizing it is just a translation into a language that makes analyzing it, especially deducing consequences, much easier. And this applies whatever the subject of the theory is.
Most of the empirical studies in sociology are analyzed using some form of statistical reasoning, which is mathematical. Given that, building a formal theory of the studied phenomenon should in principle allow for a tighter connection between the theory and empirics (cf. The Theory-Gap in Social Network Analysis by Mark Granovetter).
I would also add “accumulativeness”, much along the lines of Formal Rational Choice Theory: A Cumulative Science of Politics by David Lalman, Joe Oppenheimer, and Piotr Swistak. Although, I have to admit, after having spent 5 years or so studying mathematical sociology and selected works from mathematical economics, the cumulation is sometimes difficult to observe from the local point of view and local time scale of an individual researcher. There are so many specific models (strong assumptions etc.), and it is frequently hard to understand the bigger picture. Perhaps it is just a question of time for a “unification” to arrive… or a researcher…

By Michał | Posted in ceon | Tagged ceon | Comments Off

12.28.2011 – 02:30

My first use of the state monad

While learning Haskell, I was looking for a concise implementation of a function which "reshapes" a list into a matrix. Given the number of rows r, the number of columns c, and a list vs, the function should take r*c values from the list and create an r by c matrix out of them. Here's the type:

toMatrix :: Int -> Int -> [a] -> [[a]]

First solution

First I wrote a simpler function that would split a list into chunks of a given size, like this:

chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf c vs = h : (chunksOf c t)
    where (h, t) = splitAt c vs

Using the above, toMatrix could be implemented this way:

toMatrix r c = chunksOf c . take (r*c)
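
As a quick check of the first solution, here is a self-contained sketch that can be loaded into GHCi. The definitions repeat the ones above; the sample input is made up for illustration. Note the assumption that the chunk size is positive: with c = 0 and a non-empty list, this chunksOf would produce an infinite list of empty chunks.

```haskell
-- Split a list into consecutive chunks of length c (the last one may be shorter).
-- Assumes c > 0.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf c vs = h : chunksOf c t
  where (h, t) = splitAt c vs

-- Take the first r*c values and cut them into rows of length c.
toMatrix :: Int -> Int -> [a] -> [[a]]
toMatrix r c = chunksOf c . take (r * c)

-- Example (illustrative input):
--   toMatrix 2 3 [1 .. 10]  ==  [[1,2,3],[4,5,6]]
```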

I had a feeling that a function like chunksOf should already be present somewhere in the standard library, so I asked Hoogle, but to no avail. There was a chunksOf in Data.Text, but it operated on Text only (I retroactively named my function after the one in Data.Text). However, Hoogle returned replicateM as well…

Second solution

… and I realized I could use it with the state monad to implement toMatrix. The state could contain the list of values yet to be consumed, and the action to be replicated could be chopping off c values from the list:

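-- Imports needed by the code in this section (from base and mtl).
import Control.Monad (replicateM)
import Control.Monad.State (State, evalState, get, put, state)

-- Chop c values off the front of the state and return them.
splitOnce :: Int -> State [a] [a]
splitOnce c = do
    s <- get
    let (h, t) = splitAt c s
    put t
    return h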

After a while I realized that the same function could be written in a much more concise form:

splitOnce' :: Int -> State [a] [a]
splitOnce' = state . splitAt

The solution was, therefore:

toMatrix r = evalState . replicateM r . state . splitAt
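
To see how the state is threaded through replicateM, here is a minimal runnable sketch of the second solution. It assumes the mtl package is available for Control.Monad.State. The helper toMatrixWithRest is hypothetical (it is not part of the original solution) and is there only to expose the leftover state via runState.

```haskell
import Control.Monad (replicateM)
import Control.Monad.State (State, evalState, runState, state)

-- One step: chop c values off the front of the state and return them.
splitOnce :: Int -> State [a] [a]
splitOnce = state . splitAt

-- Run the step r times, collecting the rows; the leftover state is discarded.
toMatrix :: Int -> Int -> [a] -> [[a]]
toMatrix r = evalState . replicateM r . splitOnce

-- Hypothetical helper: the same computation, also returning the unconsumed values.
toMatrixWithRest :: Int -> Int -> [a] -> ([[a]], [a])
toMatrixWithRest r c = runState (replicateM r (splitOnce c))

-- Examples (illustrative input):
--   toMatrix 2 3 [1 .. 10]          ==  [[1,2,3],[4,5,6]]
--   toMatrixWithRest 2 3 [1 .. 10]  ==  ([[1,2,3],[4,5,6]], [7,8,9,10])
```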

Summary

The second version is good enough for me and as a bonus it helped me understand the state monad. Note that the two implementations of toMatrix are not equivalent, as they handle lists shorter than r*c in different ways. Future work: find a concise and preferably point-free implementation of chunksOf.
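
The difference on short lists can be made concrete with a small side-by-side sketch. The names toMatrixChunks and toMatrixState are hypothetical, introduced here only so that both versions can live in one file; the input is illustrative.

```haskell
import Control.Monad (replicateM)
import Control.Monad.State (evalState, state)

chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf c vs = h : chunksOf c t
  where (h, t) = splitAt c vs

-- First solution (renamed for this comparison).
toMatrixChunks :: Int -> Int -> [a] -> [[a]]
toMatrixChunks r c = chunksOf c . take (r * c)

-- Second solution (renamed for this comparison).
toMatrixState :: Int -> Int -> [a] -> [[a]]
toMatrixState r = evalState . replicateM r . state . splitAt

-- With only 4 values for a 3 by 3 matrix:
--   toMatrixChunks 3 3 [1 .. 4]  ==  [[1,2,3],[4]]      -- stops when the input runs out
--   toMatrixState  3 3 [1 .. 4]  ==  [[1,2,3],[4],[]]   -- always r rows, trailing ones empty
```

The first version yields fewer than r rows when the input is too short, while the second always yields exactly r rows, some of which may be short or empty.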

Update (2013-01-08)

This answer on StackOverflow contains a very nice implementation of chunksOf.
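
For reference, one commonly seen way to write chunksOf without explicit recursion uses unfoldr. This is a sketch only, and not necessarily the implementation from the linked answer. It is point-free in the list argument but still names c.

```haskell
import Data.List (unfoldr)

-- Keep splitting off c values, then stop at the first empty chunk.
chunksOf :: Int -> [a] -> [[a]]
chunksOf c = takeWhile (not . null) . unfoldr (Just . splitAt c)

-- Example (illustrative input):
--   chunksOf 2 "abcde"  ==  ["ab","cd","e"]
```

Thanks to laziness, unfoldr never has to produce the infinite tail of empty chunks: takeWhile stops at the first one.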

By Łukasz Bolikowski | Posted in ceon | Tagged ceon | Comments Off
