Re: Description of the Journaling topology
Matthew Dillon wrote:
>:Barely understanding the implications of this concept, it strikes me
>:as mostly logical, clean and relatively simple.
>:Which makes me curious why other projects haven't done this already?
>:What is the major reason that other projects follow a different path than
>:this one?
>:
>:--
>:mph
>
> The concepts aren't new but my recollection is that most journaling
> implementations are directly integrated into the filesystem and this
> tends to limit their flexibility. Making the journaling a kernel
> layer and taking into account forward-looking goals really opens up
> the possibilities. Forward-looking is not something that people are
> generally good at in either the open-source or the commercial world.
> (proof of concept: why ext3 is such a mess, why existing journaling
> implementations are so limited in scope).
>
>
>
So, is this meta-data journaling? Or would there be data journaling as well?
Would meta-data and data be in the same stream, or would there be separate
streams for each?
Wouldn't separate streams lead to sync problems?
> Generally speaking open-source OS projects have been severely lacking
> with regards to the construction of better backup paradigms, mostly
> relying on hardware (e.g. RAID) and external technologies (e.g. NetApp),
> or relying on major assumptions with regards to disk data reliability
> (which are no longer true) (e.g. Ext3Fs, Reiser), or block-level
> snapshots (softupdates) which are kludgy. External utilities like
> dump and tar have no realtime capabilities whatsoever and aren't even
> reliable when used as designed if the filesystem is being modified
> while a dump/tar is in progress.
>
>
>
Some of the NAS solutions have gone to a dressed-up version of "dump and
restore" for backups.
This feels like a step backwards if you're accustomed to having a
database track which individual objects can be restored, and to what
date, etc.
Backup solutions that watch for changes in a file while the backup is
progressing are a pain as well. It can also take a long time to stat all
the objects in a big file system.
> None of these integrated technologies really give me any peace of mind.
> My number one desire is to have a technology that can give the sysop
> actual peace of mind that his systems aren't going to crash and burn
> beyond any chance of recovery, be it through a software bug, disk crash,
> building fire, or intentional destruction (hackers).
>
>
>
Would people stand for a VM system where the SA had to periodically pick
which pages to swap to disk? Why do people stand for a system where the
SA has to periodically pick which files get put on tape?
There are cycles to spare. The OS should take care of it. (?)
> Our journaling layer is designed to address these issues. Providing a
> high-level filesystem operations change stream off-site is far more
> robust than providing a block device level change stream. Being able
> to go off-site in real-time to a secure (or more secure) machine can't
> be beat. Being able to rewind the journal to any point in time,
> infinitely fine-grained, gives security managers and sysops and even
> users an incredibly powerful tool for deconstructing security events
> (e.g. log file erasures), recovering lost data, and so on and so forth.
> These are very desirable traits, yah?
>
>
>
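To make the "rewind to any point in time" idea concrete, here is a minimal
sketch in Python. It is not DragonFly's journal format: the Record fields,
the "write"/"remove" operation names and the replay() helper are all
hypothetical, and it assumes the journal is already in timestamp order. The
point is only that a stream of high-level operations can be replayed up to
an arbitrary cutoff, so a log file erased at t=3.0 is still visible at t=2.5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One high-level filesystem operation (hypothetical layout)."""
    ts: float          # time the operation was journaled
    op: str            # "write" or "remove" (assumed operation names)
    path: str
    data: bytes = b""


def replay(journal, until):
    """Rebuild a path -> contents map from all records with ts <= until.

    Assumes `journal` is sorted by timestamp.
    """
    state = {}
    for rec in journal:
        if rec.ts > until:
            break
        if rec.op == "write":
            state[rec.path] = rec.data
        elif rec.op == "remove":
            state.pop(rec.path, None)
    return state


journal = [
    Record(1.0, "write", "/var/log/auth.log", b"login root\n"),
    Record(2.0, "write", "/etc/motd", b"hello\n"),
    Record(3.0, "remove", "/var/log/auth.log"),
]

before_erasure = replay(journal, until=2.5)
after_erasure = replay(journal, until=3.0)
```

Here before_erasure still contains /var/log/auth.log and after_erasure does
not, which is the kind of comparison a sysop would want when deconstructing
a security event.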
Maybe a heuristic-based daemon that watches the journal stream and
creates synthetic quiesce points?
The SA would be able to leave a tape in the drive, or a DVD-R in the
drive, and the system could periodically write out changes.
Binaries that get modified could also trigger an alarm, à la Tripwire,
etc. Depends on how people feel about adding more metadata to the
filesystems.
Apps could periodically send an "I'm quiesced" signal to the fs as well,
as opposed to a checkpoint signal coming from the outside.
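
One possible heuristic for such a daemon, sketched in Python: treat any
sufficiently long idle gap in the journal stream as a synthetic quiesce
point. Everything here is an assumption for illustration (the
quiesce_points() name, the min_gap threshold, and the idea that the daemon
only sees operation timestamps); a real daemon would presumably look at
more than timing.

```python
def quiesce_points(timestamps, min_gap):
    """Return the timestamps after which the stream stayed idle
    for at least `min_gap` seconds.

    `timestamps` is assumed to be a sorted list of journal record times.
    """
    points = []
    for prev, nxt in zip(timestamps, timestamps[1:]):
        if nxt - prev >= min_gap:
            points.append(prev)
    return points


# A burst of activity, a pause, another burst, another pause.
stream = [0.0, 0.1, 0.2, 5.0, 5.1, 12.0]
points = quiesce_points(stream, min_gap=3.0)
```

With this input the idle gaps follow the records at 0.2 and 5.1, so those
would be the candidate points at which to write changes out to tape.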
> --
>
> So why hasn't it been done or, at least, why isn't it universal after all
> these years?
>
>
>
People afraid to work on solutions that don't fit within the existing
problem space?
> It's a good question. I think it comes down to how most programmers
> have been educated over the years. It's funny, but whenever I build
> something new the first question I usually get is "what paper is your
> work based on?". I get it every time, without fail. And every time,
> without fail, I find myself trying to explain to the questioner that
> I generally do not bother to *READ* research papers... that I build
> systems from scratch based on one or two sentences' worth of concept.
>
> If I really want to throw someone for a loop I ask him whether he'd
> rather be the guy inventing the algorithm and writing the paper, or
> the guy implementing it from the paper. It's a question that forces
> the questioner to actually think with his noggin.
>
> I think that is really the crux of the problem... programmers have been
> taught to build things from templates rather than build things from
> concepts... and THAT is primarily why software is still stuck in the
> dark ages insofar as I am concerned. True innovation requires having
> lightbulbs go off above your head all the time, and you don't get that
> from reading papers. Another amusing anecdote... every time I complained
> about something in FreeBSD-5 or 6 the universal answer I got was that
> 'oh, well, Solaris did it this way' or 'there was a paper about this'
> or a myriad of other 'someone else wrote it down so it must be good'
> excuses. Not once did I ever get any other answer. Pretty sad, I think,
> and also sadly not unique to FreeBSD. It's a problem with mindset, and
> mindset is a problem with our educational system (the entire world's).
>
> I'm really happy that DragonFly has finally progressed to the point where
> we can begin to implement our loftier goals. Up until now the work has
> been primarily ripping out and reimplementing the guts of the system with
> very little visibility poking through to the end-user. Now we
> are starting to push into things that have direct consequences to the
> end-user. The journaling is one of the three major legs that will
> support the ultimate goal of single-system-image clustering. The second
> leg is a cache coherency scheme, and the third will be resource sharing
> and migration. All three will have to be very carefully and deliberately
> integrated together into a single whole to achieve the ultimate goal.
>
> This makes journaling a major turning point for the project... one,
> I hope, that attracts more people to DragonFly.
>
>
>
Hopefully the train doesn't get going so fast that people can't get on.
> -Matt
>
>
>
>