
Re: [kDev] Re: Metacrap -- and now read this...



All,

Data mining is a nice concept, as is any nice
concept.

Kendra is talking "system" and obviously needs an
application (Kendra Tools).

I have a bunch of questions that we should ask
ourselves, perhaps some of them already answered...

The main question should be: what types (categories) of
information does Kendra want to invite people to publish,
and how can other people easily read / view / listen
to it without having to go through an intensive
training program (because they won't)?

How much data and what types of data will be
published, and in what formats?

Is it to be held in Kendra's own server / data
repository, or will it stay on the content owner's own
storage, with you just providing the link?

Yes, people will write in text format (do you expect
them to write in XML?). And yes, other people will read
text format, and they are not interested in what is behind
it. AND THEY WANNA READ IN THEIR OWN LANGUAGE. The
latest report to the EU (not yet released to the public)
has shown that today more than 70% of all the content on
the web is in languages other than English.

Quick note: the fact that some software claims to be
Unicode-compliant does not necessarily mean that one
web user will be able to post his content in his
language and some other user will be able to view /
read it in his own (different) language.

And yes there is at least one solution for this.

Just to check basic assumptions: you mention "system".
Since everything is public, whatever structure
you would like to use has to be open source. It has
to be easy for other information banks to link up and
communicate, most likely using web services
technology. That points strongly to something
XML-based.
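
To make the XML-based idea a little more concrete, here is a
minimal sketch in Python. It is only an illustration, not a
proposed Kendra format: the element names (record, title, link)
and the helper names make_record / read_record are hypothetical.
It assumes the content stays on the owner's storage and the
record carries just a link to it, plus a language tag on the
title so a reader's software can tell what language it is in.

```python
import xml.etree.ElementTree as ET

# Standard namespace behind the predefined "xml:" prefix (xml:lang).
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def make_record(title, lang, content_url):
    """Build a small XML record (hypothetical schema) as a string.

    The content itself is not embedded; only a link to where the
    owner keeps it is stored.
    """
    record = ET.Element("record")
    title_el = ET.SubElement(record, "title")
    title_el.set(XML_LANG, lang)      # language of the title text
    title_el.text = title
    link_el = ET.SubElement(record, "link")
    link_el.set("href", content_url)  # content lives at the owner's URL
    return ET.tostring(record, encoding="unicode")


def read_record(xml_text):
    """Parse a record produced by make_record back into a dict."""
    record = ET.fromstring(xml_text)
    title_el = record.find("title")
    return {
        "title": title_el.text,
        "lang": title_el.get(XML_LANG),
        "href": record.find("link").get("href"),
    }
```

Another information bank that knows this (assumed) layout could
call read_record on what it receives and decide, from the
language tag, whether to show the title as-is or pass it to
whatever translation step it has.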

Actually, there is a more or less standard process you
have to go through, in terms of asking the right
questions. I suggest we set up a little conference
call where we can brainstorm. Perhaps I could ask a
professional data publishing systems consultant (my
colleague) to join the call. This is also in line with
some pragmatic suggestions regarding the project plan.
In order to plan the project you have to define the
goal (deliverables) of the project first, then define the
stages of development of the Kendra environment, etc.

Who would like to participate?

Daniel, perhaps you could suggest when and
how (which number to call)? There are lots of conference
call or audio bridge facility providers.

Cheers,

Constantine
 
--- Daniel Harris <daniel@xxxxxxxxxxxxx> wrote:
> Hi there Neil and All,
> 
> Yes, I pretty much agree with these 2 guys that
> centralising 
> definitions of metadata doesn't work very well.
> Which, I guess, is the 
> whole backbone of metadata (correct me if I'm
> wrong). Oh dear...
> 
> That's exactly why I've been saying in the project
> plan that we need to 
> enable people to talk in their own language and then
> allow anyone to 
> suggest links between different people's talkings.
> 
> Unfortunately, Neil, the more I think about this the
> more I think that 
> "data mining semi-structured data" as a foundation
> of kendraTools 
> doesn't work so well. Pull me back from the brink,
> please do!
> 
> We are both after the same goal: letting people talk
> in the way that 
> they want to. You seem to be saying: let people
> write free form or 
> semi-structured text and then pass through filters
> to derive structure 
> and relationships. Whereas, I think we can achieve
> far more accurate 
> reporting of people's ideas if we get them to enter
> their relationships 
> as they enter the actual data. Is there even a
> difference between 
> relationship and data? Could they collectively be
> called concepts?
> 
> I'm concerned that the mined data loses its link
> with the original 
> data and so traceability and hence accountability
> are lost. I want to 
> have a system that is far more fluid and cohesive than
> what we've seen so 
> far. I want people to be able to make links between
> ideas and objects 
> and hence build the structure. Test1 gives us a bit
> of that, no?
> 
> Actually, there's no reason why one shouldn't first
> enter free form 
> text or semi-structured data or very-structured data
> or just 
> structure/relationships and then go back later and
> add in or modify or 
> filter whatever was needed. But the important thing
> is that the data 
> mining is an inherent part of the system and is
> logged and tracked just 
> as the data and structure, yes?
> 
> OK. Are we in agreement now? I think I've just pressed
> the all-inclusive 
> button! ;-)
> 
> Off topic: I have a lot to say about what Cory and
> Matthew wrote but 
> I'll leave it for now. Suffice to say that if one
> wishes to view the 
> web as serendipitous then I hope it assists one in
> creating what one 
> wants to create. It doesn't work for me as a tool.
> 
> Cheers Daniel
> 
> On Friday, January 31, 2003, at 08:49  pm, Neil
> Harris wrote:
> > Here's a wonderful critique of metadata by Cory Doctorow.
> > http://www.well.com/~doctorow/metacrap.htm
> 
> On Friday, January 31, 2003, at 08:54  pm, Neil
> Harris wrote:
> > ...and this is a reaction to it.  What I think is great here is the
> > idea of data mining semi-structured data, which this author suggests
> > as the answer to Doctorow's criticisms. This seems to fit with the
> > Kendra ideal...
> > http://mpt.phrasewise.com/2003/01/27#a446
> 

