Poundstone
Sunday, February 8, 2009
Corporate Wikis - Information At the Speed of Thought?
Businesses survive on a diet of information. Information needs to be easily accessible and well organised, and those who modify it need to be able to do so easily, but within a framework of revision control and editorial control.
Businesses remain too document-centric. Information is locked in boxes with poor linkage to other information. The default process is to send documents to colleagues as email attachments.
Information is put onto intranet sites, but rather than as a standard HTML page, the information is embedded in a short document or slide pack. Downloading the document and activating the appropriate reader application takes long enough for the reader to become disengaged.
Documents remain important for some situations, particularly those relating to contracts, technical specifications and the like.
A solution that offers itself is a corporate wiki. Wikis are not suitable for all documents, and the following issues present themselves; if they cannot be resolved, then a document needs to be used.
Issues that need to be considered before implementing a wiki:
- The platform needs to be fast (wiki, after all, means quick). Hardware and network access need to be swift; navigating, searching and editing need to happen at the 'speed of thought'.
- Requires some organisational and editorial structure.
- Offline working.
- Printing/document-forming.
- Baselining. Whilst wikis do track updates, it is often necessary to form baselines, for example to align with a particular version of a product or a point in time (e.g. end of quarter).
Monday, November 3, 2008
GIS with Haskell 1
It's time to bite the bullet and do some GIS Haskelling.
My first project is to develop a simple map server in Haskell. Here are the ingredients:
- PostgreSQL + PostGIS
- Some data to put into the database. For this I sourced some Australian suburb boundaries.
- A library for manipulating GIS geometry, GEOS. In particular, this provides functions to parse WKT strings from the database into geometry structures.
- Haskell CGI package
- Haskell bindings to the GEOS library.
- Extension to HaXML adding SVG combinators.
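The post does not show the CGI plumbing, so here is a minimal sketch of what the entry point might look like using the Haskell CGI package. renderMap here is a hypothetical stand-in for the map renderer developed below:

import Network.CGI

-- Hypothetical stand-in for the real renderer, which would query
-- PostGIS and emit an SVG document as a String.
renderMap :: IO String
renderMap = return "<svg xmlns=\"http://www.w3.org/2000/svg\"/>"

cgiMain :: CGI CGIResult
cgiMain = do
  svg <- liftIO renderMap
  setHeader "Content-type" "image/svg+xml"
  output svg

main :: IO ()
main = runCGI (handleErrors cgiMain)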
The following gives a map showing the suburbs centred on the city of Melbourne.
ex0 = map (connection "host=localhost user=postgres password=postgres dbname=australia"
        `u` size 700 700
        `u` extents 144.897079467773 (-37.8575096130371) 0.16821284479996734 0.1410504416999956
        `u` layer ( table "suburbs"
              `u` geometry "the_geom"
              `u` klass ( style ( outlinecolour 255 0 0 1
                    `u` colour 100 255 100 1))
        ))
The resulting SVG file, when viewed, looks like:
[Image: the rendered SVG map of the suburbs around Melbourne]
The components that make up the map definition:
- connection - supplies the database connection parameters.
- size - the size of the SVG output to be generated.
- extents - the extents of the map in world coordinates.
- layer - defines a layer. The table property defines the database table to use, and the geometry property defines the column name to use. The klass parameter defines the style to use for drawing the geometry.
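The post does not include the definitions behind this DSL, so the following is only a plausible sketch (an assumption on my part) of how such combinators could be wired up: each property is a record update, and `u` composes them.

data MapDef   = MapDef   { mapSize :: (Int, Int), mapLayers :: [LayerDef] }
data LayerDef = LayerDef { layerTable :: String, layerGeometry :: String }

type Prop a = a -> a

-- `u` chains two property setters; written infix as p1 `u` p2.
u :: Prop a -> Prop a -> Prop a
u f g = g . f

size :: Int -> Int -> Prop MapDef
size w h m = m { mapSize = (w, h) }

table :: String -> Prop LayerDef
table t l = l { layerTable = t }

-- layer applies its properties to an empty layer and adds the result.
layer :: Prop LayerDef -> Prop MapDef
layer p m = m { mapLayers = p (LayerDef "" "") : mapLayers m }

Under this reading, map would apply the composed property to a default MapDef and render the result to SVG.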
A slightly more complex example with two layers is the following:
ex1 = map (connection "host=localhost user=postgres password=postgres dbname=australia"
        `u` size 700 700
        `u` extents 144.897079467773 (-37.8575096130371) 0.16821284479996734 0.1410504416999956
        `u` layer ( table "suburbs"
              `u` geometry "the_geom"
              `u` labelitem "name_2006"
              `u` klass ( style ( outlinecolour 255 0 0 1
                    `u` colour 100 255 100 1))
              `u` label (colour 255 255 0 1))
        `u` layer (table "suburbs"
              `u` geometry "geomunion(the_geom)"
              `u` klass ( style ( outlinecolour 0 0 0 1 `u` width 4)))
        )
This is the same as before but with a new layer that draws a thick border around the edge of all the suburb boundaries:
[Image: the rendered SVG map with suburb labels and a thick outer border]
Next steps are to source some population data, to colour code the suburbs depending on population and to include a legend.
Friday, September 26, 2008
Financial Contracts, Haskell and Probability
This article brings together the ideas presented in the paper 'How to write a financial contract' (HWFC) and Martin Erwig's PFP library.
We are going to deal with a simple but common situation in finance: if I have a contract under which I will receive $100 in 3 years' time, what is that contract worth to me now? How much would I pay to obtain that contract? To calculate the worth, we need to consider what else I could do with the money, and the most obvious action is to deposit it into a bank account that attracts interest.
The question can then be reposed as: if I put x into a bank account, what is x such that the final amount after 3 years is $100? This is easy if the interest rate is fixed; not so easy if it varies.
This blog piece provides a fragment of an implementation of HWFC that answers the above.
As this is literate Haskell, some preliminaries:
> module Main where
>
> import Probability
HWFC introduces the concept of a value process, which is a function from time to a random variable. We shall equate a random variable with a probability distribution, so a value process can be defined as:
> type PR a = Int -> Dist a
For our interest rate model, let us say that from one year to the next the interest rate can either stay the same, increase by 1%, or decrease by 1%, all with equal likelihood. We can express this as:
> interest :: Floating a => a -> PR a
> interest i n = (n *. one) i where one start = uniform [start+1/100,start,start-1/100]
The *. function (from PFP) repeats a random transition n times. The transition here takes this year's interest rate to next year's rate.
If this year the rate is 10%, after a couple of years the distribution looks like:
interest 10 2
10.0 33.3%
9.99 22.2%
10.01 22.2%
9.98 11.1%
10.02 11.1%
Let us put that to one side and look at the contracts side of things. I will short-circuit the approach in the paper and dive directly into the valuation.
> -- An observable is a time-indexed random value.
> data Obs a = O { evalObs :: PR a }
>
> konst k = O (\t -> certainly k)              -- constant observable
> lift f (O pr) = O (\t -> fmap f (pr t))
> lift2 f (O pr1) (O pr2) = O (\t -> joinWith f (pr1 t) (pr2 t))
> date = O (\t -> certainly t)                 -- the current time
>
> data Contract = C { evalContract :: PR Float }
> cconst k = C $ \_ -> certainly k             -- pays k at any time
> when o c = C $ disc (evalObs o) (evalContract c)
>
> at t = lift2 (==) date (konst t)             -- true exactly at time t
> zcb t x = when (at t) x                      -- a zero-coupon bond
>
> -- The first time at which a Boolean value process is certainly true.
> whenFirstTrue :: PR Bool -> Int
> whenFirstTrue prb = f 0 where f i = if prb i == certainly True then i else f (i+1)
>
> baseRate = 10
disc is a value process such that when the first argument (a Boolean-valued process) is true, it returns the value of the second argument; otherwise it calculates the discounted value of the second argument.
> disc :: PR Bool -> PR Float -> PR Float
> disc prb prd t = if prb t == certainly True
>                  then prd t
>                  else let s  = prd t
>                           t' = whenFirstTrue prb
>                       in discount baseRate s (t' - t)
>
> discount :: Floating a => a -> Dist a -> PR a
> discount int final time = let intspread = interest int time
>                           in joinWith (\i s -> s / (1 + i/100)) intspread final
>
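A quick sanity check on discount (my arithmetic, not from the original post): if the rate stays at 10%, a final value of 100 discounts to 100 / (1 + 10/100) = 90.909..., which is the most likely outcome in the example for ex2 below.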
Let's start with a trivial example to make sure that things are working as planned:
> ex1 = cconst 100
The value of this contract, as a random variable, is:
evalContract ex1 0
100.0 100.0%
> ex2 = zcb 3 (cconst 100)
The value of this contract is:
evalContract ex2 0
90.90909 25.9%
90.900826 22.2%
90.91736 22.2%
90.89256 11.1%
90.92562 11.1%
90.88431 3.7%
90.93389 3.7%
The PFP library has a function, expected, that gives the expected value of a distribution. The expected value of our contract is:
expected $ evalContract ex2 0
90.9091
Sunday, July 15, 2007
Haskell Mindset
There are several things an imperative programmer needs to address before fully getting to grips with Haskell:
- It is lazy. x is not evaluated when the program reaches 'x = blah'; in fact, it might never be.
- Changing state is not paramount. Forget the pigeonholes.
- Function arguments are not always necessary.
- Types can be inferred.
All of the above are possible in an imperative language (or just about) but Haskell brings these to the fore and provides a syntax and semantics that is built around them. This encourages other ways of thinking that are different from those encouraged by the OO paradigm.
One key concept in Haskell is that of monads. Some people really struggle with these, and there are numerous tutorials on them, some good and some bad. The one that worked for me was the one by Jeff Newbern. It helps if you have the above concepts in mind before embarking on the journey.
Monads are also about sequencing of computations. In imperative languages, there is typically only one way that operations are sequenced. This can be encoded in Haskell using a programmer-defined monad stack. See the description of HJS for an example of a simple stack for JavaScript that provides support for IO, state and exceptions.
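As an illustration (a minimal sketch using the standard mtl transformers, not the actual HJS code), a stack providing state, exceptions and IO might look like this:

import Control.Monad.State (StateT, get, modify, runStateT)
import Control.Monad.Except (ExceptT, throwError, runExceptT)

-- State carries the variable environment, ExceptT provides runtime
-- errors, and IO sits at the base of the stack.
type Env = [(String, Int)]
type Interp a = StateT Env (ExceptT String IO) a

setVar :: String -> Int -> Interp ()
setVar n v = modify ((n, v) :)

getVar :: String -> Interp Int
getVar n = do
  env <- get
  maybe (throwError ("undefined variable: " ++ n)) return (lookup n env)

runInterp :: Interp a -> IO (Either String (a, Env))
runInterp m = runExceptT (runStateT m [])

The order of the layers fixes how computations are sequenced: in this stack a thrown error discards the state accumulated so far, whereas layering ExceptT outside StateT would return the state alongside the error.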
Haskell
As the Haskell Wiki says:
Haskell is a general purpose, purely functional programming language featuring static typing, higher order functions, polymorphism, type classes, and monadic effects. Haskell compilers are freely available for almost any computer.
Some of my contributions to the Wiki and Hackage are:
HJS - A JavaScript interpreter.
Enterprise Haskell - Requirements for the use of Haskell in the real world.
HGene - The beginnings of a genealogy program in Haskell.
Lists considered harmful
A quick post inspired by the paper "Stream Fusion: From Lists to Streams to Nothing at All". All programming languages include features for lists/collections. The problem with your bog-standard list is that there is no tie-back to what built the list. This means that the opportunity for any optimisation you could get by fusing the creation of the list with its use is lost.
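To make the idea concrete, here is a condensed sketch of the stream representation from that paper; the step function and seed are the 'tie-back' that records how elements are produced:

{-# LANGUAGE ExistentialQuantification #-}

data Step s a = Done | Skip s | Yield a s

data Stream a = forall s. Stream (s -> Step s a) s

-- Convert a list to a stream and back again.
stream :: [a] -> Stream a
stream = Stream next
  where next []     = Done
        next (x:xs) = Yield x xs

unstream :: Stream a -> [a]
unstream (Stream next s0) = go s0
  where go s = case next s of
          Done       -> []
          Skip s'    -> go s'
          Yield x s' -> x : go s'

-- mapS rewrites the step function instead of building a new list,
-- so composed traversals fuse into a single pass.
mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream next s0) = Stream next' s0
  where next' s = case next s of
          Done       -> Done
          Skip s'    -> Skip s'
          Yield x s' -> Yield (f x) s'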
Conversations with a type checker
Haskell encourages a high level of thought prior to putting down characters. One thing that has been noticed is that, once written, a Haskell program will usually do the right thing. Haskell moves the task from punching out characters to thinking about what you are writing and, importantly, getting the types consistent across the whole program.
Haskell does not force you to specify a type for everything. This enables you to develop a function iteratively and then ask Haskell what it infers the type of the function to be. As an example, suppose you had a higher-level function whose general layout you knew. You know that the function calls other functions but are not sure what the types of these functions are. You can get an idea of their types by writing the top-level function as if the lower functions were arguments to it, and then asking Haskell for the type of the top-level function. The inferred type signature will include information about the lower-level functions.
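For instance (my illustration; the function names here are hypothetical), suppose process should parse each line of its input and then summarise the results, but neither helper exists yet:

-- The parser and summariser are passed in as arguments for now.
process parse summarise input = summarise (map parse (lines input))

Asking GHCi via :type process gives process :: (String -> a) -> ([a] -> b) -> String -> b, which spells out exactly what types the missing parse and summarise functions must have.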
Customisation or Configuration?
Following on from Peter Batty's topic and as something that has bothered me over the last year ...
The usual definition given for the configuration vs customisation dichotomy, in terms of non-code/code, is a good starting point, but there is more to it than that. Often it is helpful to look at the context and issues around a term rather than get hot and bothered about its definition. Here are a couple of other ways of looking at it:
Firstly, there is, of course, the Total Cost of Ownership issue. Customisation leads to unpredictable total cost of ownership for systems. This is made up of the cost of building the customisation in the first place, the cost of supporting the customisation (including the cost of handling problems at the boundary of the customisation and the product) and, the scary one, the cost of migrating the customisation to new versions of the product. All of these are high risk and involve the user in areas that are not core to the business. An option here is to outsource the activity and the risk.
The other way of looking at the distinction is in terms of what happens to my configuration and what happens to my customisation when the product version changes. With configuration, the expectation is that it is migrated seamlessly to the new platform; with customisation this is not expected to be the case. A parallel distinction is in what is supported and what is not: the product is expected to support my changing this (configuration) but not that (customisation).
Both of the above avoid the need to use code/non-code to distinguish between customisation and configuration, and this is good because there are times when 'changing a few parameters' is either too cumbersome or more complex logic is required. Scripting provides a means to handle this, and scripting through some form of DSL should not, in principle, be excluded from what counts as configuration. However, in practice the way that scripting is currently managed is not going to cut the mustard. The difficulty is that the 'scripts' usually sit outside the system and are stored as flat ASCII. This leads to problems when the product, for instance, changes the name of a table, adds a new parameter to a function or even removes a function.
Amongst the things that people look for when they require the product to be extended are:
- Task Automation - This will involve getting data from the GUI (for instance the currently selected object), pushing data into the GUI (putting data into a text field) and triggering actions (pressing the 'Insert' button).
- Business Rule Validation - On entering data into the field, the system will fire any 'hooked-in' validation rules.
- Loading Data - This is either data from legacy systems as part of initial migration or on going alignment of data between systems.
- Dumping Data - This is typically in the form of a report, but may also be a data export to another system.
As Peter points out though, as the application becomes more tuned to a particular domain, it is expected that it includes all the things that the customer wants. Another way of putting this is that the product encodes 'best practice'.
Saturday, January 6, 2007
The name ...
Just started to read 'The Map that Changed the World' by Simon Winchester, an easy-read historical book about William Smith, who single-handedly developed the first geological map of most of Britain. Poundstones were stones used by farmers for measuring weight, rather than buying a metallic weight. Where William grew up, many farmers chose flattened circular stones which were in fact fossils - Clypeus ploti.
About Me
- Poundstone
- Melbourne, Australia
- I work for GE in Melbourne, Australia. The views expressed here do not necessarily represent those of GE.