CVS log for src/sys/ufs/ffs/ffs_snapshot.c

Keyword substitution: kv
Default branch: MAIN


Revision 1.157.2.4: download - view: text, markup, annotated - select for diffs
Sun Jan 29 08:03:45 2012 UTC (12 days, 19 hours ago) by mckusick
Branches: RELENG_9
Diff to: previous 1.157.2.3: preferred, colored; branchpoint 1.157: preferred, colored; next MAIN 1.158: preferred, colored
Changes since revision 1.157.2.3: +9 -1 lines
SVN rev 230725 on 2012-01-29 08:03:45Z by mckusick

MFC r230249:

Make sure all intermediate variables holding mount flags (mnt_flag)
and that all internal kernel calls passing mount flags are declared
as uint64_t so that flags in the top 32-bits are not lost.
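
The truncation this change guards against can be sketched in a few lines of
C. This is an illustration only: EXAMPLE_HIGH_FLAG and both helper functions
are hypothetical names, not identifiers from the FreeBSD tree.

```c
#include <stdint.h>

/* Hypothetical flag above bit 31; not a real FreeBSD MNT_* value. */
#define EXAMPLE_HIGH_FLAG (UINT64_C(1) << 40)

/* Buggy pattern: a 32-bit intermediate drops the upper half of the flags. */
uint64_t
example_roundtrip_narrow(uint64_t mnt_flag)
{
	uint32_t tmp = (uint32_t)mnt_flag;	/* bits 32..63 are discarded */

	return (tmp);
}

/* Fixed pattern: the intermediate is as wide as the flag field itself. */
uint64_t
example_roundtrip_wide(uint64_t mnt_flag)
{
	uint64_t tmp = mnt_flag;

	return (tmp);
}
```

Passing EXAMPLE_HIGH_FLAG | 1 through the narrow helper yields only the low
bit, while the wide helper returns the value unchanged.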

MFC r230250:

There are several bugs/hangs when trying to take a snapshot on a UFS/FFS
filesystem running with journaled soft updates. Until these problems
have been tracked down, return ENOTSUPP when an attempt is made to
take a snapshot on a filesystem running with journaled soft updates.

Revision 1.160: download - view: text, markup, annotated - select for diffs
Tue Jan 17 01:14:56 2012 UTC (3 weeks, 4 days ago) by mckusick
Branches: MAIN
CVS tags: HEAD
Diff to: previous 1.159: preferred, colored
Changes since revision 1.159: +9 -1 lines
SVN rev 230250 on 2012-01-17 01:14:56Z by mckusick

There are several bugs/hangs when trying to take a snapshot on a UFS/FFS
filesystem running with journaled soft updates. Until these problems
have been tracked down, return ENOTSUPP when an attempt is made to
take a snapshot on a filesystem running with journaled soft updates.
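
The shape of such an early-return guard might look like the userland sketch
below. The struct, its field, and the function are hypothetical stand-ins
rather than the kernel's actual code, and the POSIX ENOTSUP constant from
<errno.h> is used here only as a portable substitute for the kernel error
named above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mount state consulted by the snapshot code. */
struct example_mount {
	bool journaled_softupdates;
};

/* Refuse to start a snapshot while journaled soft updates are active. */
int
example_take_snapshot(const struct example_mount *mp)
{
	if (mp->journaled_softupdates)
		return (ENOTSUP);	/* fail early, before any state is touched */

	/* ... the snapshot work itself would go here ... */
	return (0);
}
```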

MFC after: 2 weeks

Revision 1.157.2.3.2.1: download - view: text, markup, annotated - select for diffs
Fri Nov 11 04:20:22 2011 UTC (3 months ago) by kensmith
Branches: RELENG_9_0
CVS tags: RELENG_9_0_0_RELEASE
Diff to: previous 1.157.2.3: preferred, colored; next MAIN 1.157.2.4: preferred, colored
Changes since revision 1.157.2.3: +0 -0 lines
SVN rev 227445 on 2011-11-11 04:20:22Z by kensmith

Copy stable/9 to releng/9.0 as part of the FreeBSD 9.0-RELEASE release
cycle.

Approved by:	re (implicit)

Revision 1.157.2.3: download - view: text, markup, annotated - select for diffs
Wed Sep 28 19:38:47 2011 UTC (4 months, 1 week ago) by mckusick
Branches: RELENG_9
CVS tags: RELENG_9_0_BP
Branch point for: RELENG_9_0
Diff to: previous 1.157.2.2: preferred, colored; branchpoint 1.157: preferred, colored
Changes since revision 1.157.2.2: +24 -21 lines
SVN rev 225851 on 2011-09-28 19:38:47Z by mckusick

MFC r225807:
This update eliminates a lock-order reversal warning discovered
while tracking down the system hang reported in kern/160662 and
corrected in revision 225806 (MFC'ed as 225850). The LOR is not
the cause of the system hang and indeed cannot cause an actual
deadlock. However, it can be easily eliminated by deferring the
acquisition of a buflock until after all the vnode locks have
been acquired.
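
The idea of avoiding a reversal by reordering acquisitions can be shown with
a toy rank checker. This is a much simplified model, not the kernel's WITNESS
code; the rank values and all names are assumptions made for the example.

```c
#include <stdbool.h>

/* Assumed ordering for the example: vnode locks rank before buffer locks. */
enum example_rank { EXAMPLE_RANK_VNODE = 1, EXAMPLE_RANK_BUF = 2 };

struct example_witness {
	int  highest_held;	/* highest rank acquired so far */
	bool reversal;		/* set once a lower rank follows a higher one */
};

void
example_acquire(struct example_witness *w, enum example_rank r)
{
	if ((int)r < w->highest_held)
		w->reversal = true;
	else
		w->highest_held = (int)r;
}

/* Buffer lock taken between two vnode locks: the checker flags a reversal. */
bool
example_interleaved_order(void)
{
	struct example_witness w = { 0, false };

	example_acquire(&w, EXAMPLE_RANK_VNODE);
	example_acquire(&w, EXAMPLE_RANK_BUF);
	example_acquire(&w, EXAMPLE_RANK_VNODE);
	return (w.reversal);
}

/* Buffer lock deferred until all vnode locks are held: no reversal. */
bool
example_deferred_order(void)
{
	struct example_witness w = { 0, false };

	example_acquire(&w, EXAMPLE_RANK_VNODE);
	example_acquire(&w, EXAMPLE_RANK_VNODE);
	example_acquire(&w, EXAMPLE_RANK_BUF);
	return (w.reversal);
}
```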

As journaled soft updates first appeared in 9.0, this will be the
only MFC of this change.

Approved by:     re (kib)
Reported by:     Hans Ottevanger
PR:              kern/160662

Revision 1.157.2.2: download - view: text, markup, annotated - select for diffs
Wed Sep 28 19:36:21 2011 UTC (4 months, 1 week ago) by mckusick
Branches: RELENG_9
Diff to: previous 1.157.2.1: preferred, colored; branchpoint 1.157: preferred, colored
Changes since revision 1.157.2.1: +1 -1 lines
SVN rev 225850 on 2011-09-28 19:36:21Z by mckusick

MFC: r225806:
This update eliminates the system hang reported in kern/160662 when
taking a snapshot on a filesystem running with journaled soft updates.

As journaled soft updates first appeared in 9.0, this will be the
only MFC of this change.

Approved by:     re (kib)
Reported by:     Hans Ottevanger
Fix verified by: Hans Ottevanger
PR:              kern/160662

Revision 1.159: download - view: text, markup, annotated - select for diffs
Tue Sep 27 17:41:48 2011 UTC (4 months, 2 weeks ago) by mckusick
Branches: MAIN
Diff to: previous 1.158: preferred, colored
Changes since revision 1.158: +24 -21 lines
SVN rev 225807 on 2011-09-27 17:41:48Z by mckusick

This update eliminates a lock-order reversal warning discovered
while tracking down the system hang reported in kern/160662 and
corrected in revision 225806. The LOR is not the cause of the system
hang and indeed cannot cause an actual deadlock. However, it can
be easily eliminated by deferring the acquisition of a buflock until
after all the vnode locks have been acquired.

Reported by:     Hans Ottevanger
PR:              kern/160662

Revision 1.158: download - view: text, markup, annotated - select for diffs
Tue Sep 27 17:34:02 2011 UTC (4 months, 2 weeks ago) by mckusick
Branches: MAIN
Diff to: previous 1.157: preferred, colored
Changes since revision 1.157: +1 -1 lines
SVN rev 225806 on 2011-09-27 17:34:02Z by mckusick

This update eliminates the system hang reported in kern/160662 when
taking a snapshot on a filesystem running with journaled soft updates.

Reported by:     Hans Ottevanger
Fix verified by: Hans Ottevanger
PR:              kern/160662

Revision 1.157.2.1: download - view: text, markup, annotated - select for diffs
Fri Sep 23 00:51:37 2011 UTC (4 months, 2 weeks ago) by kensmith
Branches: RELENG_9
Diff to: previous 1.157: preferred, colored
Changes since revision 1.157: +0 -0 lines
SVN rev 225736 on 2011-09-23 00:51:37Z by kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by:	re (implicit)

Revision 1.157: download - view: text, markup, annotated - select for diffs
Sat Jun 18 21:10:03 2011 UTC (7 months, 3 weeks ago) by mckusick
Branches: MAIN
CVS tags: RELENG_9_BP
Branch point for: RELENG_9
Diff to: previous 1.156: preferred, colored
Changes since revision 1.156: +2 -1 lines
SVN rev 223268 on 2011-06-18 21:10:03Z by mckusick

Fixed dereference of a NULL pointer.

Reported by:	Peter Holm

Revision 1.156: download - view: text, markup, annotated - select for diffs
Wed Jun 15 23:19:09 2011 UTC (7 months, 3 weeks ago) by mckusick
Branches: MAIN
Diff to: previous 1.155: preferred, colored
Changes since revision 1.155: +44 -20 lines
SVN rev 223127 on 2011-06-15 23:19:09Z by mckusick

Ensure that filesystem metadata contained within persistent snapshots
is always kept consistent.

Suggested by:	Jeff Roberson

Revision 1.155: download - view: text, markup, annotated - select for diffs
Sun Jun 12 19:27:05 2011 UTC (8 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.154: preferred, colored
Changes since revision 1.154: +63 -15 lines
SVN rev 223020 on 2011-06-12 19:27:05Z by mckusick

Update to soft updates journaling to properly track freed blocks
that get claimed by snapshots.

Submitted by:	Jeff Roberson
Tested by:	Peter Holm

Revision 1.154: download - view: text, markup, annotated - select for diffs
Wed Feb 9 15:33:13 2011 UTC (12 months ago) by netchild
Branches: MAIN
Diff to: previous 1.153: preferred, colored
Changes since revision 1.153: +1 -0 lines
SVN rev 218485 on 2011-02-09 15:33:13Z by netchild

Add some FEATURE macros for some UFS features.

SU+J is not included as a FEATURE macro:
 - it was not in the tree during the GSoC
 - I do not see an option to en-/disable it in NOTES

Two minor changes were made during the review compared to what was developed
during GSoC 2010.

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availability if
needed.

Sponsored by:	Google Summer of Code 2010
Submitted by:	kibab
Reviewed by:	kib
X-MFC after:	to be determined in last commit with code from this project

Revision 1.136.2.6.4.1: download - view: text, markup, annotated - select for diffs
Tue Dec 21 17:10:29 2010 UTC (13 months, 3 weeks ago) by kensmith
Branches: RELENG_7_4
CVS tags: RELENG_7_4_0_RELEASE
Diff to: previous 1.136.2.6: preferred, colored; next MAIN 1.137: preferred, colored
Changes since revision 1.136.2.6: +0 -0 lines
SVN rev 216618 on 2010-12-21 17:10:29Z by kensmith

Copy stable/7 to releng/7.4 in preparation for FreeBSD-7.4 release.

Approved by:	re (implicit)

Revision 1.150.2.2.4.1: download - view: text, markup, annotated - select for diffs
Tue Dec 21 17:09:25 2010 UTC (13 months, 3 weeks ago) by kensmith
Branches: RELENG_8_2
CVS tags: RELENG_8_2_0_RELEASE
Diff to: previous 1.150.2.2: preferred, colored; next MAIN 1.151: preferred, colored
Changes since revision 1.150.2.2: +0 -0 lines
SVN rev 216617 on 2010-12-21 17:09:25Z by kensmith

Copy stable/8 to releng/8.2 in preparation for FreeBSD-8.2 release.

Approved by:	re (implicit)

Revision 1.150.2.2.2.1: download - view: text, markup, annotated - select for diffs
Mon Jun 14 02:09:06 2010 UTC (19 months, 4 weeks ago) by kensmith
Branches: RELENG_8_1
CVS tags: RELENG_8_1_0_RELEASE
Diff to: previous 1.150.2.2: preferred, colored; next MAIN 1.151: preferred, colored
Changes since revision 1.150.2.2: +0 -0 lines
SVN rev 209145 on 2010-06-14 02:09:06Z by kensmith

Copy stable/8 to releng/8.1 in preparation for 8.1-RC1.

Approved by:	re (implicit)

Revision 1.153: download - view: text, markup, annotated - select for diffs
Fri May 7 08:45:21 2010 UTC (21 months ago) by jeff
Branches: MAIN
Diff to: previous 1.152: preferred, colored
Changes since revision 1.152: +8 -0 lines
SVN rev 207742 on 2010-05-07 08:45:21Z by jeff

 - Call softdep_prealloc() before any of the balloc routines in the
   snapshot code.
 - Don't fsync() vnodes in prealloc if copy on write is in progress.  It
   is not safe to recurse back into the write path here.

Reported by:	Vladimir Grebenschikov <vova@fbsd.ru>

Revision 1.152: download - view: text, markup, annotated - select for diffs
Sat Apr 24 07:05:35 2010 UTC (21 months, 2 weeks ago) by jeff
Branches: MAIN
Diff to: previous 1.151: preferred, colored
Changes since revision 1.151: +49 -17 lines
SVN rev 207141 on 2010-04-24 07:05:35Z by jeff

 - Merge soft-updates journaling from projects/suj/head into head.  This
   brings in support for an optional intent log which eliminates the need
   for background fsck on unclean shutdown.

Sponsored by:   iXsystems, Yahoo!, and Juniper.
With help from: McKusick and Peter Holm

Revision 1.150.2.2: download - view: text, markup, annotated - select for diffs
Thu Feb 11 18:34:06 2010 UTC (23 months, 4 weeks ago) by mjacob
Branches: RELENG_8
CVS tags: RELENG_8_2_BP, RELENG_8_1_BP
Branch point for: RELENG_8_2, RELENG_8_1
Diff to: previous 1.150.2.1: preferred, colored; branchpoint 1.150: preferred, colored; next MAIN 1.151: preferred, colored
Changes since revision 1.150.2.1: +1 -1 lines
SVN rev 203786 on 2010-02-11 18:34:06Z by mjacob

MFC a number of changes from head for ISP (203478,203463,203444,202418,201758,
201408,201325,200089,198822,197373,197372,197214,196162). Since one of those
changes was a semicolon cleanup from somebody else, this touches a lot more.

Revision 1.136.2.6.2.1: download - view: text, markup, annotated - select for diffs
Wed Feb 10 00:26:20 2010 UTC (2 years ago) by kensmith
Branches: RELENG_7_3
CVS tags: RELENG_7_3_0_RELEASE
Diff to: previous 1.136.2.6: preferred, colored; next MAIN 1.137: preferred, colored
Changes since revision 1.136.2.6: +0 -0 lines
SVN rev 203736 on 2010-02-10 00:26:20Z by kensmith

Copy stable/7 to releng/7.3 as part of the 7.3-RELEASE process.

Approved by:	re (implicit)

Revision 1.151: download - view: text, markup, annotated - select for diffs
Thu Jan 7 21:01:37 2010 UTC (2 years, 1 month ago) by mbr
Branches: MAIN
Diff to: previous 1.150: preferred, colored
Changes since revision 1.150: +1 -1 lines
SVN rev 201758 on 2010-01-07 21:01:37Z by mbr

Remove extraneous semicolons, no functional changes.

Submitted by:	Marc Balmer <marc@msys.ch>
MFC after:	1 week

Revision 1.150.2.1.2.1: download - view: text, markup, annotated - select for diffs
Sun Oct 25 01:10:29 2009 UTC (2 years, 3 months ago) by kensmith
Branches: RELENG_8_0
CVS tags: RELENG_8_0_0_RELEASE
Diff to: previous 1.150.2.1: preferred, colored; next MAIN 1.150.2.2: preferred, colored
Changes since revision 1.150.2.1: +0 -0 lines
SVN rev 198460 on 2009-10-25 01:10:29Z by kensmith

Copy stable/8 to releng/8.0 as part of 8.0-RELEASE release procedure.

Approved by:	re (implicit)

Revision 1.150.2.1: download - view: text, markup, annotated - select for diffs
Mon Aug 3 08:13:06 2009 UTC (2 years, 6 months ago) by kensmith
Branches: RELENG_8
CVS tags: RELENG_8_0_BP
Branch point for: RELENG_8_0
Diff to: previous 1.150: preferred, colored
Changes since revision 1.150: +0 -0 lines
SVN rev 196045 on 2009-08-03 08:13:06Z by kensmith

Copy head to stable/8 as part of 8.0 Release cycle.

Approved by:	re (Implicit)

Revision 1.136.2.6: download - view: text, markup, annotated - select for diffs
Wed May 20 23:34:59 2009 UTC (2 years, 8 months ago) by kmacy
Branches: RELENG_7
CVS tags: RELENG_7_4_BP, RELENG_7_3_BP
Branch point for: RELENG_7_4, RELENG_7_3
Diff to: previous 1.136.2.5: preferred, colored; branchpoint 1.136: preferred, colored; next MAIN 1.137: preferred, colored
Changes since revision 1.136.2.5: +2 -1 lines
SVN rev 192498 on 2009-05-20 23:34:59Z by kmacy

MFC ZFS version 13. This includes the changes by pjd (see original message
below) as well as the following:

- the recurring deadlock was fixed by deferring vinactive to a dedicated thread

- zfs boot for all pool types now works
      Submitted by: dfr

- kmem now goes up to 512GB so arc is now limited by physmem

- the arc now experiences backpressure from the vm (which can be too
much - but this allows ZFS to work without any tunables on amd64)

- frequently recurring LOR in the ARC fixed

- zfs send coredump fix

- fixes for various PRs

Supported by: Barrett Lyon, BitGravity

Revision 185029 - (view) (annotate) - [select for diffs]
Modified Mon Nov 17 20:49:29 2008 UTC (6 months ago) by pjd
File length: 38244 byte(s)
Diff to previous 177698

Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes.

This brings a huge number of changes; I'll enumerate only the user-visible ones:

- Delegated Administration

       Allows regular users to perform ZFS operations, like file system
       creation, snapshot creation, etc.

- L2ARC

       Level 2 cache for ZFS - allows to use additional disks for cache.
       Huge performance improvements mostly for random read of mostly
       static content.

- slog

       Allow to use additional disks for ZFS Intent Log to speed up
       operations like fsync(2).

- vfs.zfs.super_owner

       Allows regular users to perform privileged operations on files stored
       on ZFS file systems they own. Be very careful with this one.

- chflags(2)

       Not all the flags are supported. This still needs work.

- ZFSBoot

       Support to boot off of ZFS pool. Not finished, AFAIK.

       Submitted by:   dfr

- Snapshot properties

- New failure modes

       Previously, if a write request failed, the system panicked. Now one
       can select one of three failure modes:
       - panic - panic on write error
       - wait - wait for disk to reappear
       - continue - serve read requests if possible, block write requests

- Refquota, refreservation properties

       Just quota and reservation properties, but don't count space consumed
       by children file systems, clones and snapshots.

 - Sparse volumes

       ZVOLs that don't reserve space in the pool.

 - External attributes

       Compatible with extattr(2).

 - NFSv4-ACLs

       Not sure about the status, might not be complete yet.

       Submitted by:   trasz

 - Creation-time properties

 - Regression tests for zpool(8) command.

 Obtained from:        OpenSolaris

Revision 1.136.2.5: download - view: text, markup, annotated - select for diffs
Fri May 15 19:54:19 2009 UTC (2 years, 8 months ago) by jhb
Branches: RELENG_7
Diff to: previous 1.136.2.4: preferred, colored; branchpoint 1.136: preferred, colored
Changes since revision 1.136.2.4: +2 -2 lines
SVN rev 192154 on 2009-05-15 19:54:19Z by jhb

MFC: Adjust some variables (mostly related to the buffer cache) that hold
address space sizes to be longs instead of ints.  This includes an ABI
compat shim for the kern.bufspace sysctl for old binaries.

Revision 1.136.2.4.2.1: download - view: text, markup, annotated - select for diffs
Wed Apr 15 03:14:26 2009 UTC (2 years, 9 months ago) by kensmith
Branches: RELENG_7_2
CVS tags: RELENG_7_2_0_RELEASE
Diff to: previous 1.136.2.4: preferred, colored; next MAIN 1.136.2.5: preferred, colored
Changes since revision 1.136.2.4: +0 -0 lines
SVN rev 191087 on 2009-04-15 03:14:26Z by kensmith

Create releng/7.2 from stable/7 in preparation for 7.2-RELEASE.

Approved by:	re (implicit)

Revision 1.150: download - view: text, markup, annotated - select for diffs
Fri Apr 10 10:52:19 2009 UTC (2 years, 10 months ago) by rwatson
Branches: MAIN
CVS tags: RELENG_8_BP
Branch point for: RELENG_8
Diff to: previous 1.149: preferred, colored
Changes since revision 1.149: +0 -1 lines
SVN rev 190888 on 2009-04-10 10:52:19Z by rwatson

Remove VOP_LEASE and supporting functions.  This hasn't been used since
the removal of NQNFS, but was left in in case it was required for NFSv4.
Since our new NFSv4 client and server can't use it for their
requirements, GC the old mechanism, as well as other unused lease-
related code and interfaces.

Due to its impact on kernel programming and binary interfaces, this
change should not be MFC'd.

Proposed by:    jeff
Reviewed by:    jeff
Discussed with: rmacklem, zach loafman @ isilon

Revision 1.149: download - view: text, markup, annotated - select for diffs
Mon Mar 9 19:35:20 2009 UTC (2 years, 11 months ago) by jhb
Branches: MAIN
Diff to: previous 1.148: preferred, colored
Changes since revision 1.148: +2 -2 lines
SVN rev 189595 on 2009-03-09 19:35:20Z by jhb

Adjust some variables (mostly related to the buffer cache) that hold
address space sizes to be longs instead of ints.  Specifically, the following
values are now longs: runningbufspace, bufspace, maxbufspace,
bufmallocspace, maxbufmallocspace, lobufspace, hibufspace, lorunningspace,
hirunningspace, maxswzone, maxbcache, and maxpipekva.  Previously, a
relatively small number (~ 44000) of buffers set in kern.nbuf would result
in integer overflows resulting either in hangs or bogus values of
hidirtybuffers and lodirtybuffers.  Now one has to overflow a long to see
such problems.  There was a check for a nbuf setting that would cause
overflows in the auto-tuning of nbuf.  I've changed it to always check and
cap nbuf but warn if a user-supplied tunable would cause overflow.
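
The overflow described above can be reproduced with simple arithmetic. In the
sketch below the per-buffer size of 64 KiB is an assumption chosen so that
roughly 44000 buffers exceed a signed 32-bit byte count; the macro and both
functions are hypothetical and are not the kernel's tuning code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed per-buffer size for the illustration (64 KiB). */
#define EXAMPLE_BUF_BYTES INT64_C(65536)

/* Would nbuf buffers overflow a byte count held in a signed 32-bit int? */
bool
example_overflows_int32(int64_t nbuf)
{
	/* Computed in 64 bits so the check itself cannot overflow. */
	return (nbuf * EXAMPLE_BUF_BYTES > INT32_MAX);
}

/* Cap nbuf at the largest value whose byte count still fits in 32 bits. */
int64_t
example_cap_nbuf(int64_t nbuf)
{
	int64_t max_nbuf = INT32_MAX / EXAMPLE_BUF_BYTES;

	return (nbuf > max_nbuf ? max_nbuf : nbuf);
}
```

With these assumptions 44000 buffers need 2,883,584,000 bytes, which is above
INT32_MAX (2,147,483,647), and the cap works out to 32767 buffers.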

Note that this changes the ABI of several sysctls that are used by things
like top(1), etc., so any MFC would probably require some gross shims
to allow for that.

MFC after:	1 month

Revision 1.136.2.4: download - view: text, markup, annotated - select for diffs
Sat Feb 14 23:02:21 2009 UTC (2 years, 11 months ago) by kib
Branches: RELENG_7
CVS tags: RELENG_7_2_BP
Branch point for: RELENG_7_2
Diff to: previous 1.136.2.3: preferred, colored; branchpoint 1.136: preferred, colored
Changes since revision 1.136.2.3: +5 -2 lines
SVN rev 188620 on 2009-02-14 23:02:21Z by kib

MFC r183073:
When an attempt is made to suspend a filesystem that is already suspended,
wait until the current suspension is lifted instead of silently returning
success immediately. The consequences of calling vfs_write_resume() when
not owning the suspension are not well-defined at best.
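
The "wait instead of returning success" behaviour can be modelled in userland
with a mutex and a condition variable. This is only a sketch of the pattern:
the struct and function names are hypothetical, and the kernel uses its own
sleep and wakeup primitives rather than pthreads.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-filesystem suspension state. */
struct example_fs {
	pthread_mutex_t lock;
	pthread_cond_t  resumed;
	bool            suspended;
};

void
example_fs_init(struct example_fs *fs)
{
	pthread_mutex_init(&fs->lock, NULL);
	pthread_cond_init(&fs->resumed, NULL);
	fs->suspended = false;
}

/* Block until no one else holds the suspension, then take it. */
void
example_suspend(struct example_fs *fs)
{
	pthread_mutex_lock(&fs->lock);
	while (fs->suspended)
		pthread_cond_wait(&fs->resumed, &fs->lock);
	fs->suspended = true;	/* the caller now owns the suspension */
	pthread_mutex_unlock(&fs->lock);
}

/* Release the suspension and wake every waiter. */
void
example_resume(struct example_fs *fs)
{
	pthread_mutex_lock(&fs->lock);
	fs->suspended = false;
	pthread_cond_broadcast(&fs->resumed);
	pthread_mutex_unlock(&fs->lock);
}
```

Because a second caller sleeps until the first one resumes, only the thread
that actually owns the suspension ever reaches the resume call.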

Add the vfs_susp_clean() mount method to be called from
vfs_write_resume(). Set it to process_deferred_inactive() for ffs, and
stop calling it manually.

Add the thread flag TDP_IGNSUSP that allows bypassing the suspension
point in vn_start_write(). It is intended for use by VFS in
situations where the suspender wants to do some i/o requiring calls to
vn_start_write(), and this i/o cannot be done later.

Note that addition of the mount method and new struct mount field change
the KBI. This was approved by re and no objections on stable@ were
raised.

Revision 1.136.2.3.2.1: download - view: text, markup, annotated - select for diffs
Tue Nov 25 02:59:29 2008 UTC (3 years, 2 months ago) by kensmith
Branches: RELENG_7_1
CVS tags: RELENG_7_1_0_RELEASE
Diff to: previous 1.136.2.3: preferred, colored; next MAIN 1.136.2.4: preferred, colored
Changes since revision 1.136.2.3: +0 -0 lines
SVN rev 185281 on 2008-11-25 02:59:29Z by kensmith

Create releng/7.1 in preparation for moving into RC phase of 7.1 release
cycle.

Approved by:	re (implicit)

Revision 1.136.2.3: download - view: text, markup, annotated - select for diffs
Tue Nov 18 18:21:36 2008 UTC (3 years, 2 months ago) by ambrisko
Branches: RELENG_7
CVS tags: RELENG_7_1_BP
Branch point for: RELENG_7_1
Diff to: previous 1.136.2.2: preferred, colored; branchpoint 1.136: preferred, colored
Changes since revision 1.136.2.2: +4 -0 lines
SVN rev 185054 on 2008-11-18 18:21:36Z by ambrisko

MFC 184934:

For now, flush the buffer cache every 10 cylinder groups to free
up space.  If the buffer cache fills up then the disk systems can
grind to a halt.

PR:		128832
Approved by:	re (kensmith)

Revision 1.148: download - view: text, markup, annotated - select for diffs
Thu Nov 13 17:40:21 2008 UTC (3 years, 2 months ago) by ambrisko
Branches: MAIN
Diff to: previous 1.147: preferred, colored
Changes since revision 1.147: +4 -0 lines
SVN rev 184934 on 2008-11-13 17:40:21Z by ambrisko

For now, flush the buffer cache every 10 cylinder groups to free
up space.  If the buffer cache fills up then the disk systems can
grind to a halt.  Better tuning can be figured out later.
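
A periodic flush of this kind is usually a modulo check inside the loop over
cylinder groups. The sketch below only counts the flush points; the interval
macro and the function are hypothetical, and where exactly the real code
places the check within its loop is an assumption.

```c
/* Assumed interval, matching the "every 10 cylinder groups" wording. */
#define EXAMPLE_FLUSH_INTERVAL 10

/*
 * Walk ncg cylinder groups and return how many times the buffer cache
 * would be flushed along the way.
 */
int
example_walk_cylinder_groups(int ncg)
{
	int flushes = 0;

	for (int cg = 0; cg < ncg; cg++) {
		/* ... per-cylinder-group snapshot work would go here ... */
		if ((cg + 1) % EXAMPLE_FLUSH_INTERVAL == 0)
			flushes++;	/* stand-in for the actual flush call */
	}
	return (flushes);
}
```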

Tested by:	Tim, others and work
Reviewed by:	Kostik Belousov
PR:		128832

Revision 1.147: download - view: text, markup, annotated - select for diffs
Thu Oct 23 15:53:51 2008 UTC (3 years, 3 months ago) by des
Branches: MAIN
Diff to: previous 1.146: preferred, colored
Changes since revision 1.146: +16 -16 lines
SVN rev 184205 on 2008-10-23 15:53:51Z by des

Retire the MALLOC and FREE macros.  They are an abomination unto style(9).

MFC after:	3 months

Revision 1.136.2.2: download - view: text, markup, annotated - select for diffs
Mon Oct 20 16:44:59 2008 UTC (3 years, 3 months ago) by kib
Branches: RELENG_7
Diff to: previous 1.136.2.1: preferred, colored; branchpoint 1.136: preferred, colored
Changes since revision 1.136.2.1: +7 -0 lines
SVN rev 184080 on 2008-10-20 16:44:59Z by kib

MFC r183822:
Sync up summary information for cylinder groups while data is already
in memory during snapshot creation.

Approved by:	re (kensmith)

Revision 1.146: download - view: text, markup, annotated - select for diffs
Mon Oct 13 14:05:01 2008 UTC (3 years, 3 months ago) by kib
Branches: MAIN
Diff to: previous 1.145: preferred, colored
Changes since revision 1.145: +7 -0 lines
SVN rev 183822 on 2008-10-13 14:05:01Z by kib

Sync up summary information for cylinder groups while data is already
in memory during snapshot creation. This improves the results of the
background fsck.

Submitted by: tegge
MFC after: 1 week

Revision 1.103.2.26.2.1: download - view: text, markup, annotated - select for diffs
Thu Oct 2 02:57:24 2008 UTC (3 years, 4 months ago) by kensmith
Branches: RELENG_6_4
CVS tags: RELENG_6_4_0_RELEASE
Diff to: previous 1.103.2.26: preferred, colored; next MAIN 1.104: preferred, colored
Changes since revision 1.103.2.26: +0 -0 lines
SVN rev 183531 on 2008-10-02 02:57:24Z by kensmith

Create releng/6.4 from stable/6 in preparation for 6.4-RC1.

Approved by:	re (implicit)

Revision 1.145: download - view: text, markup, annotated - select for diffs
Tue Sep 16 11:51:06 2008 UTC (3 years, 4 months ago) by kib
Branches: MAIN
Diff to: previous 1.144: preferred, colored
Changes since revision 1.144: +5 -2 lines
SVN rev 183073 on 2008-09-16 11:51:06Z by kib

When an attempt is made to suspend a filesystem that is already suspended,
wait until the current suspension is lifted instead of silently returning
success immediately. The consequences of calling vfs_write_resume() when
not owning the suspension are not well-defined at best.

Add the vfs_susp_clean() mount method to be called from
vfs_write_resume(). Set it to process_deferred_inactive() for ffs, and
stop calling it manually.

Add the thread flag TDP_IGNSUSP that allows bypassing the suspension
point in vn_start_write(). It is intended for use by VFS in
situations where the suspender wants to do some i/o requiring calls to
vn_start_write(), and this i/o cannot be done later.

Reviewed by:	tegge
In collaboration with:	pho
MFC after:	 1 month

Revision 1.144: download - view: text, markup, annotated - select for diffs
Thu Aug 28 15:23:18 2008 UTC (3 years, 5 months ago) by attilio
Branches: MAIN
Diff to: previous 1.143: preferred, colored
Changes since revision 1.143: +1 -1 lines
SVN rev 182371 on 2008-08-28 15:23:18Z by attilio

Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and of no use.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>

Revision 1.143: download - view: text, markup, annotated - select for diffs
Mon Mar 31 12:01:21 2008 UTC (3 years, 10 months ago) by kib
Branches: MAIN
Diff to: previous 1.142: preferred, colored
Changes since revision 1.142: +1 -0 lines
Add the support for the AT_FDCWD and fd-relative name lookups to the
namei(9).

Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho

Revision 1.142: download - view: text, markup, annotated - select for diffs
Mon Mar 31 07:47:08 2008 UTC (3 years, 10 months ago) by jeff
Branches: MAIN
Diff to: previous 1.141: preferred, colored
Changes since revision 1.141: +109 -67 lines
 - Don't free snapdata structures when they are no longer in use.
   Keeping the lockmgr lock valid allows us to switch the v_lock pointer
   in snapshot vnodes between the embedded lockmgr lock and snapdata
   lock without needing the vnode interlock to protect against races
 - Keep unused snapdata structures in a list.
 - Add a function to lock the devvp and allocate a snapdata to it or
   acquire a new one without races.  The old function was safe from
   creation races because we set the mount flag when creating snapshots
   and thus serializing them.  However, it might have been subject to
   destroying races.
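
Keeping retired structures on a list instead of freeing them can be sketched
as a small free-list pool. All names below are hypothetical; the real snapdata
structure also embeds a lock, and the locking that protects the list is left
out here.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a snapdata structure. */
struct example_snapdata {
	struct example_snapdata *next;	/* link used only while on the free list */
	int                      in_use;
};

struct example_pool {
	struct example_snapdata *free_head;	/* unused structures, never freed */
};

/* Reuse a retired structure when one is available, else allocate a new one. */
struct example_snapdata *
example_snapdata_get(struct example_pool *p)
{
	struct example_snapdata *sn = p->free_head;

	if (sn != NULL) {
		p->free_head = sn->next;
	} else {
		sn = calloc(1, sizeof(*sn));
		if (sn == NULL)
			return (NULL);
	}
	sn->next = NULL;
	sn->in_use = 1;
	return (sn);
}

/* Retire a structure: its memory stays valid and goes back on the list. */
void
example_snapdata_put(struct example_pool *p, struct example_snapdata *sn)
{
	sn->in_use = 0;
	sn->next = p->free_head;
	p->free_head = sn;
}
```

Since a retired structure is never handed back to the allocator, a pointer to
it (and to any lock embedded in it) stays valid, which is the property the
first item in the list above relies on.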

Reviewed by:	tegge

Revision 1.141: download - view: text, markup, annotated - select for diffs
Wed Mar 19 06:19:01 2008 UTC (3 years, 10 months ago) by jeff
Branches: MAIN
Diff to: previous 1.140: preferred, colored
Changes since revision 1.140: +0 -4 lines
 - Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from
   requiring the per-process spinlock to only requiring the process lock.
 - Reflect these changes in the proc.h documentation and consumers throughout
   the kernel.  This is a substantial reduction in locking cost for these
   fields and was made possible by recent changes to threading support.

Revision 1.103.2.26: download - view: text, markup, annotated - select for diffs
Mon Feb 25 09:52:12 2008 UTC (3 years, 11 months ago) by obrien
Branches: RELENG_6
CVS tags: RELENG_6_4_BP
Branch point for: RELENG_6_4
Diff to: previous 1.103.2.25: preferred, colored; branchpoint 1.103: preferred, colored; next MAIN 1.104: preferred, colored
Changes since revision 1.103.2.25: +2 -2 lines
MFC: use *_EMPTY macros when appropriate.

Revision 1.103.2.25: download - view: text, markup, annotated - select for diffs
Mon Feb 25 06:30:23 2008 UTC (3 years, 11 months ago) by obrien
Branches: RELENG_6
Diff to: previous 1.103.2.24: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.24: +1 -1 lines
MFC: Turn most ffs 'DIAGNOSTIC's into INVARIANTS.

Revision 1.136.2.1: download - view: text, markup, annotated - select for diffs
Fri Feb 15 16:43:02 2008 UTC (3 years, 11 months ago) by obrien
Branches: RELENG_7
Diff to: previous 1.136: preferred, colored
Changes since revision 1.136: +1 -1 lines
MFC: Turn most ffs 'DIAGNOSTIC's into INVARIANTS.

Revision 1.140: download - view: text, markup, annotated - select for diffs
Thu Jan 24 12:34:29 2008 UTC (4 years ago) by attilio
Branches: MAIN
Diff to: previous 1.139: preferred, colored
Changes since revision 1.139: +20 -26 lines
Cleanup lockmgr interface and exported KPI:
- Remove the "thread" argument from the lockmgr() function as it is
  always curthread now
- Axe lockcount() function as it is no longer used
- Axe LOCKMGR_ASSERT() as it is really bogus and not currently used.
  Hopefully this will soon be replaced by something suitable for it.
- Remove the prototype for dumplockinfo() as the function is no longer
  present

Additionally:
- Introduce a KASSERT() in lockstatus() in order to let it accept only
  curthread or NULL as they should only be passed
- Do a little bit of style(9) cleanup on lockmgr.h

The KPI is heavily broken by this change, so manpages and
FreeBSD_version will be modified accordingly by further commits.

Tested by: matteo

Revision 1.139: download - view: text, markup, annotated - select for diffs
Sun Jan 13 14:44:13 2008 UTC (4 years ago) by attilio
Branches: MAIN
Diff to: previous 1.138: preferred, colored
Changes since revision 1.138: +14 -14 lines
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjunction with 'thread' argument passing which is always curthread.
Remove the useless extra argument and pass curthread explicitly to lower
layer functions, when necessary.

The KPI is broken by this change, which should affect several ports, so
a version bump and manpage update will be committed separately.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>

Revision 1.138: download - view: text, markup, annotated - select for diffs
Thu Jan 10 01:10:56 2008 UTC (4 years, 1 month ago) by attilio
Branches: MAIN
Diff to: previous 1.137: preferred, colored
Changes since revision 1.137: +6 -6 lines
vn_lock() is currently only used with 'curthread' passed as argument.
Remove this argument and pass curthread directly to the underlying
VOP_LOCK1() VFS method. This modification makes the code cleaner and, in
particular, removes an annoying dependency, helping the next lockmgr() cleanup.
The KPI, obviously, changes as a result.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, the next commits will address
a similar cleanup of VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>

Revision 1.137: download - view: text, markup, annotated - select for diffs
Thu Nov 8 17:21:50 2007 UTC (4 years, 3 months ago) by obrien
Branches: MAIN
Diff to: previous 1.136: preferred, colored
Changes since revision 1.136: +1 -1 lines
Turn most ffs 'DIAGNOSTIC's into INVARIANTS.

Revision 1.103.2.24: download - view: text, markup, annotated - select for diffs
Mon Jun 11 10:53:48 2007 UTC (4 years, 8 months ago) by kib
Branches: RELENG_6
CVS tags: RELENG_6_3_BP, RELENG_6_3_0_RELEASE, RELENG_6_3
Diff to: previous 1.103.2.23: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.23: +114 -0 lines
MFC:
rev. 1.11 of src/sys/geom/geom_vfs.c
rev. 1.516 of src/sys/kern/vfs_bio.c
rev. 1.35 of src/sys/nfs4client/nfs4_vnops.c
rev. 1.272 of src/sys/nfsclient/nfs_vnops.c
rev. 1.195 of src/sys/sys/buf.h
rev. 1.18 of src/sys/sys/bufobj.h
rev. 1.73 of src/sys/ufs/ffs/ffs_extern.h
rev. 1.133 of src/sys/ufs/ffs/ffs_snapshot.c
rev. 1.324 of src/sys/ufs/ffs/ffs_vfsops.c

Avoid dealing with buffers in bdwrite() that are from other side of
snaplock divisor in the lock order than the buffer being written. Add
new BOP, bop_bdwrite(), to do dirty buffer flushing for same vnode in
the bdwrite(). Default implementation, bufbdflush(), refactors the code
from bdwrite(). For ffs device buffers, specialized implementation is
used.

This commit changes KPI/KBI, thus recompilation of out of tree kernel
modules is required.

Approved by:	re (kensmith)

Revision 1.136: download - view: text, markup, annotated - select for diffs
Tue Jun 5 00:00:56 2007 UTC (4 years, 8 months ago) by jeff
Branches: MAIN
CVS tags: RELENG_7_BP, RELENG_7_0_BP, RELENG_7_0_0_RELEASE, RELENG_7_0
Branch point for: RELENG_7
Diff to: previous 1.135: preferred, colored
Changes since revision 1.135: +15 -9 lines
Commit 14/14 of sched_lock decomposition.
 - Use thread_lock() rather than sched_lock for per-thread scheduling
   synchronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)

Revision 1.103.2.23: download - view: text, markup, annotated - select for diffs
Tue Apr 24 11:08:27 2007 UTC (4 years, 9 months ago) by kib
Branches: RELENG_6
Diff to: previous 1.103.2.22: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.22: +1 -0 lines
MFC rev. 1.135:
Fix the NAMEI zone leak when snapshot was successfully created.

Revision 1.135: download - view: text, markup, annotated - select for diffs
Tue Apr 10 09:31:42 2007 UTC (4 years, 10 months ago) by kib
Branches: MAIN
Diff to: previous 1.134: preferred, colored
Changes since revision 1.134: +1 -0 lines
Fix the NAMEI zone leak when snapshot was successfully created.

Reported and tested by:	Peter Holm
MFC after:		2 weeks

Revision 1.134: download - view: text, markup, annotated - select for diffs
Wed Apr 4 07:29:53 2007 UTC (4 years, 10 months ago) by delphij
Branches: MAIN
Diff to: previous 1.133: preferred, colored
Changes since revision 1.133: +2 -2 lines
Use *_EMPTY macros when appropriate.

Revision 1.103.2.22: download - view: text, markup, annotated - select for diffs
Thu Feb 1 04:45:42 2007 UTC (5 years ago) by mpp
Branches: RELENG_6
Diff to: previous 1.103.2.21: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.21: +1 -13 lines
MFC:  Quota system cleanup & disallow negative ids when accounting for
quota usage.

ffs/ffs_alloc.c rev 1.143
ffs/ffs_snapshot.c rev 1.132
ufs/quota.h rev 1.29
ufs/ufs_quota.c rev 1.86 - 1.88
ufs/ufs_vfsops.c rev 1.48

Revision 1.133: download - view: text, markup, annotated - select for diffs
Tue Jan 23 10:01:18 2007 UTC (5 years ago) by kib
Branches: MAIN
Diff to: previous 1.132: preferred, colored
Changes since revision 1.132: +114 -0 lines
Cylinder group bitmaps and blocks containing inode for a snapshot
file are after snaplock, while other ffs device buffers are before
snaplock in global lock order. By itself, this could cause deadlock
when bdwrite() tries to flush dirty buffers on snapshotted ffs. If,
during the flush, COW activity for snapshot needs to allocate block
and ffs_alloccg() selects the cylinder group that is being written
by bdwrite(), then kernel would panic due to recursive buffer lock
acquisition.

Avoid dealing with buffers in bdwrite() that are from the other side of the
snaplock divisor in the lock order than the buffer being written. Add
new BOP, bop_bdwrite(), to do dirty buffer flushing for same vnode in
the bdwrite(). Default implementation, bufbdflush(), refactors the code
from bdwrite(). For ffs device buffers, specialized implementation is
used.

Reviewed by:	tegge, jeff, Russell Cattelan (cattelan xfs org, xfs changes)
Tested by:	Peter Holm
X-MFC after:	3 weeks (if ever: it changes ABI)
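
The dispatch described in this commit can be pictured as a per-object
operations table with a default flush routine and a specialized one. The
following is a minimal user-space sketch, not the kernel code: every name
in it (struct buf_sk, struct bufobj_sk, default_bdflush(), ffsdev_bdflush(),
the after_snaplk field) is a hypothetical stand-in, and the policy of the
specialized routine is only an assumption drawn from the log message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified buffer: not the kernel's struct buf. */
struct buf_sk {
    bool dirty;
    bool after_snaplk;  /* assumed: true for cg bitmaps and snapshot inode blocks */
};

struct bufobj_sk;

/* Operations table; bop_bdflush stands in for the new bop_bdwrite() hook. */
struct buf_ops_sk {
    /* Flush dirty buffers on behalf of a delayed write of bp; returns count written. */
    int (*bop_bdflush)(struct bufobj_sk *bo, const struct buf_sk *bp);
};

struct bufobj_sk {
    const struct buf_ops_sk *ops;
    struct buf_sk *bufs;
    size_t nbufs;
};

/* Default policy: write every dirty buffer on the object. */
static int
default_bdflush(struct bufobj_sk *bo, const struct buf_sk *bp)
{
    int n = 0;

    (void)bp;
    for (size_t i = 0; i < bo->nbufs; i++) {
        if (bo->bufs[i].dirty) {
            bo->bufs[i].dirty = false;
            n++;
        }
    }
    return (n);
}

/* Specialized policy: only touch buffers on the same side of snaplk as bp. */
static int
ffsdev_bdflush(struct bufobj_sk *bo, const struct buf_sk *bp)
{
    int n = 0;

    for (size_t i = 0; i < bo->nbufs; i++) {
        struct buf_sk *b = &bo->bufs[i];

        if (b->dirty && b->after_snaplk == bp->after_snaplk) {
            b->dirty = false;
            n++;
        }
    }
    return (n);
}

static const struct buf_ops_sk default_ops = { default_bdflush };
static const struct buf_ops_sk ffsdev_ops = { ffsdev_bdflush };

/* The delayed-write path only calls through the table. */
static int
bdwrite_flush(struct bufobj_sk *bo, const struct buf_sk *bp)
{
    return (bo->ops->bop_bdflush(bo, bp));
}
```

Keeping the policy behind a function pointer lets the generic bdwrite()
path stay unaware of snaplock, which appears to be the point of adding a
new BOP rather than special-casing ffs inside bdwrite().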

Revision 1.132: download - view: text, markup, annotated - select for diffs
Sat Jan 20 11:58:31 2007 UTC (5 years ago) by mpp
Branches: MAIN
Diff to: previous 1.131: preferred, colored
Changes since revision 1.131: +1 -13 lines
Quota system cleanup.

1) Do not do quota accounting for the actual quota data files
   or for file system snapshot files ("system" files).  This
   prevents a deadlock described in PR kern/30958 if the kernel
   ever has to grow the quota file.  Snapshot files were already
   exempt from the quota checks, but this change generalized the check.
2) Fix a cast that caused extremely large uids/gids to incorrectly
   write the quota information to the data file at a truncated
   value for a uint_t32 id value.  The incorrect cast caused quota
   files in this case to be around 4GB in size, with the correct cast
   they can now be 131GB in size.  Also related to PR kern/30958.
3) Check for what appear to be negative UIDs/GIDs and not account
   for them.  This prevents the quota files from becoming 131GB in
   size and causing quotacheck to run forever at bootup.  This could
   also cause the kernel to try and expand the quota file, which might
   deadlock due to the issue in #1.  kern/30958 and kern/38156
   (and some much older closed PR's).
4) With the deadlock problems gone, the kernel can now expand the
   size of the quota database files if it needs to.
5) Pass in the i-node count change value to chkiq and chkiqchg as an
   int, like it used to be before the common routine was split up
   into 2 different routines to increase / decrease the i-node in-use
   count.  Prevents an underflow on the i-node count.  Related
   to PR kern/89247.
6) Prevent the block usage from growing slowly if a file system is
   full and the write was denied due to that fact.  PR kern/89247.

Some of these changes require an updated quotacheck to prevent
the creation of huge (131GB) quota data files (item #3).

#1/#4 probably fixes a lot of the random hangs when quotas are enabled,
possibly some of the jail hangs.
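
Items 2 and 3 come down to offset arithmetic on the quota file. The sketch
below illustrates the kind of truncation a misplaced cast can cause. It is
not the code from ufs_quota.c: the 32-byte record size (DQBLK_SIZE_SK) and
all function names are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed on-disk record size; the log message does not state it. */
#define DQBLK_SIZE_SK 32u

/* Truncating variant: the product is computed in 32 bits and wraps. */
static uint64_t
dq_offset_truncated(uint32_t id)
{
    return ((uint64_t)(uint32_t)(id * DQBLK_SIZE_SK));
}

/* Widened variant: convert the id to 64 bits before multiplying. */
static uint64_t
dq_offset_wide(uint32_t id)
{
    return ((uint64_t)id * DQBLK_SIZE_SK);
}

/* Item 3: an id with the top bit set looks negative as a signed 32-bit value. */
static bool
id_looks_negative(uint32_t id)
{
    return ((id & 0x80000000u) != 0);
}
```

Under the 32-byte assumption the truncated offset can never reach 4 GiB,
while the widened offset for the largest 32-bit id is just under
2^32 * 32 bytes (131072 MiB), which may be where the 4GB and 131GB figures
in the message come from.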

Revision 1.103.2.21: download - view: text, markup, annotated - select for diffs
Tue Nov 7 16:56:11 2006 UTC (5 years, 3 months ago) by kib
Branches: RELENG_6
CVS tags: RELENG_6_2_BP, RELENG_6_2_0_RELEASE, RELENG_6_2
Diff to: previous 1.103.2.20: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.20: +18 -6 lines
MFC
sys/ufs/ffs/ffs_inode.c rev. 1.107
sys/ufs/ffs/ffs_snapshot.c rev. 1.131
sys/ufs/ffs/ffs_vnops.c rev. 1.161
sys/ufs/ufs/inode.h rev. 1.51
sys/ufs/ufs/ufs_vnops.c rev. 1.280

Do not translate the IN_ACCESS inode flag into the IN_MODIFIED while filesystem
is suspending/suspended. Doing so may result in deadlock. Instead, set the
(new) IN_LAZYACCESS flag, that becomes IN_MODIFIED when suspend is lifted.

Change the locking protocol in order to set the IN_ACCESS and timestamps
without upgrading shared vnode lock to exclusive (see comments in the
inode.h). Before that, inode was modified while holding only shared
lock.

Tested on RELENG_6 by:	Peter Holm
Approved by:	re (kensmith)

Revision 1.131: download - view: text, markup, annotated - select for diffs
Tue Oct 10 09:20:54 2006 UTC (5 years, 4 months ago) by kib
Branches: MAIN
Diff to: previous 1.130: preferred, colored
Changes since revision 1.130: +18 -6 lines
Do not translate the IN_ACCESS inode flag into the IN_MODIFIED while filesystem
is suspending/suspended. Doing so may result in deadlock. Instead, set the
(new) IN_LAZYACCESS flag, that becomes IN_MODIFIED when suspend is lifted.

Change the locking protocol in order to set the IN_ACCESS and timestamps
without upgrading shared vnode lock to exclusive (see comments in the
inode.h). Before that, inode was modified while holding only shared
lock.

Tested by:	Peter Holm
Reviewed by:	tegge, bde
Approved by:	pjd (mentor)
MFC after:	3 weeks
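
The flag handling described in the first paragraph can be sketched as a
small state transition. This is a user-space illustration under stated
assumptions: the flag values, struct inode_sk, and both function names are
hypothetical, and clearing IN_ACCESS at translation time is assumed rather
than taken from the commit.

```c
#include <stdbool.h>

/* Hypothetical flag values; the real ones live in ufs/ufs/inode.h. */
#define IN_ACCESS_SK     0x01u
#define IN_MODIFIED_SK   0x02u
#define IN_LAZYACCESS_SK 0x04u

struct inode_sk {
    unsigned i_flag;
};

/* Translate a pending access-time update, honoring suspension. */
static void
translate_access_sk(struct inode_sk *ip, bool fs_suspended)
{
    if ((ip->i_flag & IN_ACCESS_SK) == 0)
        return;
    ip->i_flag &= ~IN_ACCESS_SK;
    /* While suspended, park the update instead of dirtying the inode. */
    ip->i_flag |= fs_suspended ? IN_LAZYACCESS_SK : IN_MODIFIED_SK;
}

/* When the suspension is lifted, the parked update becomes a real one. */
static void
resume_sk(struct inode_sk *ip)
{
    if ((ip->i_flag & IN_LAZYACCESS_SK) != 0) {
        ip->i_flag &= ~IN_LAZYACCESS_SK;
        ip->i_flag |= IN_MODIFIED_SK;
    }
}
```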

Revision 1.103.2.20: download - view: text, markup, annotated - select for diffs
Mon Oct 9 19:55:19 2006 UTC (5 years, 4 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.19: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.19: +1 -1 lines
MFC: Don't restore MNT_QUOTA bit in mnt_flag after snapshot creation,
     closing a race between nmount() and quotactl().

Approved by:	re (kensmith)

Revision 1.103.2.19: download - view: text, markup, annotated - select for diffs
Mon Oct 9 19:47:17 2006 UTC (5 years, 4 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.18: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.18: +6 -1 lines
MFC: Use mount interlock to protect all changes to mnt_flag and
     mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be
     lost when nmount() raced against sync(), sync_fsync() or quotactl().

Approved by:	re (kensmith)

Revision 1.103.2.18: download - view: text, markup, annotated - select for diffs
Wed Sep 27 00:37:46 2006 UTC (5 years, 4 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.17: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.17: +1 -0 lines
MFC: Release references acquired by VOP_GETWRITEMOUNT() and vfs_getvfs().

Approved by:	re (kensmith)

Revision 1.130: download - view: text, markup, annotated - select for diffs
Tue Sep 26 04:19:11 2006 UTC (5 years, 4 months ago) by tegge
Branches: MAIN
Diff to: previous 1.129: preferred, colored
Changes since revision 1.129: +1 -1 lines
Don't restore MNT_QUOTA bit in mnt_flag after snapshot creation,
closing a race between nmount() and quotactl().

Revision 1.129: download - view: text, markup, annotated - select for diffs
Tue Sep 26 04:12:48 2006 UTC (5 years, 4 months ago) by tegge
Branches: MAIN
Diff to: previous 1.128: preferred, colored
Changes since revision 1.128: +6 -1 lines
Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag.
This eliminates a race where MNT_UPDATE flag could be lost when nmount()
raced against sync(), sync_fsync() or quotactl().
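
The race closed here is a classic lost update on a shared flag word: two
threads each read mnt_flag, modify their own bit, and write the word back.
A minimal user-space sketch of the fix follows, using a pthread mutex in
place of the mount interlock. struct mount_sk, the helper names, and the
flag values are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define MNT_UPDATE_SK 0x1ull
#define MNT_QUOTA_SK  0x2ull

struct mount_sk {
    pthread_mutex_t interlock;  /* stands in for the mount interlock */
    uint64_t mnt_flag;
};

/* Every read-modify-write of mnt_flag happens with the interlock held. */
static void
mnt_flag_set(struct mount_sk *mp, uint64_t bits)
{
    pthread_mutex_lock(&mp->interlock);
    mp->mnt_flag |= bits;
    pthread_mutex_unlock(&mp->interlock);
}

static void
mnt_flag_clear(struct mount_sk *mp, uint64_t bits)
{
    pthread_mutex_lock(&mp->interlock);
    mp->mnt_flag &= ~bits;
    pthread_mutex_unlock(&mp->interlock);
}
```

Without the lock, a thread that saved the whole word and later restored it
could silently drop a bit (such as MNT_UPDATE) set by another thread in
between.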

Revision 1.103.2.17: download - view: text, markup, annotated - select for diffs
Mon Sep 4 13:55:32 2006 UTC (5 years, 5 months ago) by kib
Branches: RELENG_6
Diff to: previous 1.103.2.16: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.16: +1 -1 lines
While checking for update of snapshot file in the ffs_copyonwrite,
first filter out metadata update. Otherwise, devfs vnode could be
erroneously interpreted as ufs one, causing further check of i_flags
to use random memory.

PR:	kern/100365
Debugged and fix described by:	tegge
Approved by:	pjd (mentor)

Revision 1.103.2.16: download - view: text, markup, annotated - select for diffs
Mon Sep 4 10:05:25 2006 UTC (5 years, 5 months ago) by pjd
Branches: RELENG_6
Diff to: previous 1.103.2.15: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.15: +2 -5 lines
MFC:	sys/ufs/ffs/ffs_snapshot.c	1.121

- Set bio_done directly to NULL to indicate that we want to wait for the bio.
- Use biowait() instead of copying the code.

Revision 1.128: download - view: text, markup, annotated - select for diffs
Mon Aug 21 17:20:19 2006 UTC (5 years, 5 months ago) by kib
Branches: MAIN
Diff to: previous 1.127: preferred, colored
Changes since revision 1.127: +1 -1 lines
While checking for update of snapshot file in the ffs_copyonwrite,
first filter out metadata update. Otherwise, devfs vnode could be
erroneously interpreted as ufs one, causing further check of i_flags
to use random memory.

PR:	kern/100365
Debugged and fix described by:	tegge
Approved by:	pjd (mentor)
MFC after:	2 weeks
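
The fix is about the order of two checks. A hedged user-space sketch, on
the assumption that metadata buffers hang off the device (devfs) vnode,
whose private data is not a UFS inode; all types, names, and the flag
value are hypothetical stand-ins for the kernel's.

```c
#include <stdbool.h>
#include <stddef.h>

#define SF_SNAPSHOT_SK 0x1u  /* hypothetical value */

struct inode_sk {
    unsigned i_flags;
};

struct vnode_sk {
    void *v_data;  /* a struct inode_sk * only for ufs vnodes */
};

struct cowbuf_sk {
    struct vnode_sk *b_vp;
    bool b_is_metadata;
};

/* Does this write target a snapshot file itself? */
static bool
write_targets_snapshot(const struct cowbuf_sk *bp)
{
    const struct inode_sk *ip;

    /* Filter metadata first: its vnode's v_data must not be read as an inode. */
    if (bp->b_is_metadata)
        return (false);
    ip = bp->b_vp->v_data;
    return ((ip->i_flags & SF_SNAPSHOT_SK) != 0);
}
```

With the checks in the opposite order, the i_flags test would read
whatever memory the device vnode's private pointer refers to, which
matches the "random memory" symptom in the message.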

Revision 1.103.2.15: download - view: text, markup, annotated - select for diffs
Wed May 24 20:20:16 2006 UTC (5 years, 8 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.14: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.14: +4 -0 lines
MFC: Read block hints list from last snapshot on the active snapshot list.

Revision 1.103.2.14: download - view: text, markup, annotated - select for diffs
Wed May 24 20:16:46 2006 UTC (5 years, 8 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.13: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.13: +11 -0 lines
MFC: Copy last block on file system again after file system has been
     suspended.

Obtained from:	NetBSD

Revision 1.103.2.13: download - view: text, markup, annotated - select for diffs
Wed May 24 20:13:36 2006 UTC (5 years, 8 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.12: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.12: +3 -2 lines
MFC: Don't leak a locked buffer if last block on file system cannot be read.

Revision 1.103.2.12: download - view: text, markup, annotated - select for diffs
Wed May 24 20:11:32 2006 UTC (5 years, 8 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.11: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.11: +4 -6 lines
MFC: Errors detected while file system is suspended should not trigger an
     assertion failure.

Revision 1.103.2.11: download - view: text, markup, annotated - select for diffs
Wed May 24 20:08:55 2006 UTC (5 years, 8 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.10: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.10: +9 -2 lines
MFC: Expunge traces of unlinked snapshot files when making a new snapshot.

Revision 1.127: download - view: text, markup, annotated - select for diffs
Tue May 16 00:14:20 2006 UTC (5 years, 8 months ago) by tegge
Branches: MAIN
Diff to: previous 1.126: preferred, colored
Changes since revision 1.126: +4 -0 lines
Read block hints list from last snapshot on the active snapshot list.

Revision 1.126: download - view: text, markup, annotated - select for diffs
Mon May 15 23:18:49 2006 UTC (5 years, 8 months ago) by tegge
Branches: MAIN
Diff to: previous 1.125: preferred, colored
Changes since revision 1.125: +11 -0 lines
Copy last block on file system again after file system has been suspended.

Obtained from:	NetBSD

Revision 1.125: download - view: text, markup, annotated - select for diffs
Mon May 15 22:59:23 2006 UTC (5 years, 8 months ago) by tegge
Branches: MAIN
Diff to: previous 1.124: preferred, colored
Changes since revision 1.124: +3 -2 lines
Don't leak a locked buffer if last block on file system cannot be read.

Revision 1.124: download - view: text, markup, annotated - select for diffs
Mon May 15 22:52:22 2006 UTC (5 years, 8 months ago) by tegge
Branches: MAIN
Diff to: previous 1.123: preferred, colored
Changes since revision 1.123: +4 -6 lines
Errors detected while file system is suspended should not trigger an
assertion failure.

Revision 1.103.2.10: download - view: text, markup, annotated - select for diffs
Sun May 14 00:23:27 2006 UTC (5 years, 9 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.9: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.9: +21 -0 lines
MFC: Turn off disk quotas for snapshot files.

Revision 1.103.2.9: download - view: text, markup, annotated - select for diffs
Sun May 14 00:02:48 2006 UTC (5 years, 9 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.8: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.8: +10 -0 lines
MFC: Detect the snapshot file being prematurely unlinked.

Revision 1.103.2.8: download - view: text, markup, annotated - select for diffs
Sat May 13 23:52:59 2006 UTC (5 years, 9 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.7: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.7: +10 -3 lines
MFC: A side effect of calling runningbufwakeup() is that
     bp->b_runningbufspace is cleared.  Save old value and restore
     bp->b_runningbufspace before returning from ffs_copyonwrite().

Revision 1.103.2.7: download - view: text, markup, annotated - select for diffs
Sat May 13 23:49:45 2006 UTC (5 years, 9 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.6: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.6: +66 -41 lines
MFC: Close a race when VOP_LOCK() on a snapshot file is attempted at the
     same time as it is changed back into a normal file.  The locker would
     get the shared "snaplk" lock which would no longer be the correct lock
     for the vnode.

Revision 1.103.2.6: download - view: text, markup, annotated - select for diffs
Sat May 13 23:40:44 2006 UTC (5 years, 9 months ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.5: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.5: +58 -0 lines
MFC: Add NO_FFS_SNAPSHOT kernel option.

Revision 1.123: download - view: text, markup, annotated - select for diffs
Sat May 13 20:41:37 2006 UTC (5 years, 9 months ago) by tegge
Branches: MAIN
Diff to: previous 1.122: preferred, colored
Changes since revision 1.122: +9 -2 lines
Expunge traces of unlinked snapshot files when making a new snapshot.

Revision 1.122: download - view: text, markup, annotated - select for diffs
Fri May 5 20:10:03 2006 UTC (5 years, 9 months ago) by tegge
Branches: MAIN
Diff to: previous 1.121: preferred, colored
Changes since revision 1.121: +21 -0 lines
Turn off disk quotas for snapshot files.

Revision 1.121: download - view: text, markup, annotated - select for diffs
Fri May 5 10:06:22 2006 UTC (5 years, 9 months ago) by pjd
Branches: MAIN
Diff to: previous 1.120: preferred, colored
Changes since revision 1.120: +2 -5 lines
- Set bio_done directly to NULL to indicate that we want to wait for the bio.
- Use biowait() instead of copying the code.

MFC after:	1 month

Revision 1.120: download - view: text, markup, annotated - select for diffs
Wed May 3 00:29:22 2006 UTC (5 years, 9 months ago) by tegge
Branches: MAIN
Diff to: previous 1.119: preferred, colored
Changes since revision 1.119: +10 -0 lines
Detect the snapshot file being prematurely unlinked.

Revision 1.119: download - view: text, markup, annotated - select for diffs
Wed May 3 00:04:38 2006 UTC (5 years, 9 months ago) by tegge
Branches: MAIN
Diff to: previous 1.118: preferred, colored
Changes since revision 1.118: +10 -3 lines
A side effect of calling runningbufwakeup() is that bp->b_runningbufspace is
cleared.  Save old value and restore bp->b_runningbufspace before returning
from ffs_copyonwrite().
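
The save-and-restore pattern can be sketched as follows. This is a
user-space illustration, not the kernel code: the _sk names, the global
counter, and the re-accounting on restore are assumptions.

```c
/* Hypothetical stand-ins for the kernel's buffer and global accounting. */
struct runbuf_sk {
    long b_runningbufspace;
};

static long runningbufspace_sk;

/* Drops the buffer from the in-progress total; clears the field as a side effect. */
static void
runningbufwakeup_sk(struct runbuf_sk *bp)
{
    runningbufspace_sk -= bp->b_runningbufspace;
    bp->b_runningbufspace = 0;
}

static void
copyonwrite_sk(struct runbuf_sk *bp)
{
    long saved = bp->b_runningbufspace;  /* remember before it is cleared */

    if (saved != 0)
        runningbufwakeup_sk(bp);

    /* ... copy-on-write work that may block on snaplk would go here ... */

    if (saved != 0) {
        /* Restore the field and count the I/O as in progress again. */
        bp->b_runningbufspace = saved;
        runningbufspace_sk += saved;
    }
}
```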

Revision 1.118: download - view: text, markup, annotated - select for diffs
Tue May 2 23:52:43 2006 UTC (5 years, 9 months ago) by tegge
Branches: MAIN
Diff to: previous 1.117: preferred, colored
Changes since revision 1.117: +66 -41 lines
Close a race when VOP_LOCK() on a snapshot file is attempted at the
same time as it is changed back into a normal file.  The locker would
get the shared "snaplk" lock which would no longer be the correct lock
for the vnode.

Revision 1.117: download - view: text, markup, annotated - select for diffs
Fri Mar 31 03:54:20 2006 UTC (5 years, 10 months ago) by jeff
Branches: MAIN
Diff to: previous 1.116: preferred, colored
Changes since revision 1.116: +1 -0 lines
 - Release the references acquired by VOP_GETWRITEMOUNT and vfs_getvfs().

Discussed with:	tegge
Tested by:	kris
Sponsored by:	Isilon Systems, Inc.

Revision 1.103.2.5: download - view: text, markup, annotated - select for diffs
Wed Mar 22 17:42:31 2006 UTC (5 years, 10 months ago) by tegge
Branches: RELENG_6
CVS tags: RELENG_6_1_BP, RELENG_6_1_0_RELEASE, RELENG_6_1
Diff to: previous 1.103.2.4: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.4: +3 -4 lines
MFC: Ensure that vnode for directory isn't reclaimed before ffs_snapshot()
     has completed expunging unlinked files.  It could come back at another
     memory location causing a lock order reversal.

Approved by:	re (kensmith)

Revision 1.116: download - view: text, markup, annotated - select for diffs
Sun Mar 19 21:05:10 2006 UTC (5 years, 10 months ago) by tegge
Branches: MAIN
Diff to: previous 1.115: preferred, colored
Changes since revision 1.115: +3 -4 lines
Ensure that vnode for directory isn't reclaimed before ffs_snapshot() has
completed expunging unlinked files.  It could come back at another memory
location causing a lock order reversal.

Revision 1.103.2.4: download - view: text, markup, annotated - select for diffs
Mon Mar 13 03:07:42 2006 UTC (5 years, 11 months ago) by jeff
Branches: RELENG_6
Diff to: previous 1.103.2.3: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.3: +88 -1 lines
MFC Revs 1.115, 1.114, 1.113
VFS SMP fixes, stack api, softupdates fixes.

Sponsored by:	Isilon Systems, Inc.
Approved by:	re (scottl)

Revision 1.115: download - view: text, markup, annotated - select for diffs
Sun Mar 12 05:26:12 2006 UTC (5 years, 11 months ago) by jeff
Branches: MAIN
Diff to: previous 1.114: preferred, colored
Changes since revision 1.114: +0 -7 lines
 - Remove the call to softdep_waitidle after suspending the filesystem.
   This does not do what I wanted as all dirty buffers must be flushed
   by the call to ffs_sync and any remaining dependency work would mean
   that this failed.

Pointed out by: tegge

Revision 1.114: download - view: text, markup, annotated - select for diffs
Sat Mar 11 01:08:36 2006 UTC (5 years, 11 months ago) by tegge
Branches: MAIN
Diff to: previous 1.113: preferred, colored
Changes since revision 1.113: +88 -1 lines
Block secondary writes while expunging active unlinked files.

Fix detection of active unlinked files by checking VI_OWEINACT and
VI_DOINGINACT in addition to v_usecount.

Defer inactive handling for unlinked files if the file system is mostly
suspended (secondary writes being blocked).

Perform deferred inactive handling after the file system is resumed.
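
The detection change in the second paragraph amounts to widening a
predicate. A minimal sketch, with hypothetical flag values and a
simplified vnode structure:

```c
#include <stdbool.h>

/* Hypothetical flag values; the real ones are defined in sys/vnode.h. */
#define VI_OWEINACT_SK   0x1u
#define VI_DOINGINACT_SK 0x2u

struct actvnode_sk {
    int v_usecount;
    unsigned v_iflag;
};

/*
 * A vnode counts as active if it is in use, or if inactive processing is
 * still owed or currently running, even though v_usecount is already zero.
 */
static bool
vnode_is_active_sk(const struct actvnode_sk *vp)
{
    return (vp->v_usecount > 0 ||
        (vp->v_iflag & (VI_OWEINACT_SK | VI_DOINGINACT_SK)) != 0);
}
```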

Revision 1.113: download - view: text, markup, annotated - select for diffs
Thu Mar 2 05:50:23 2006 UTC (5 years, 11 months ago) by jeff
Branches: MAIN
Diff to: previous 1.112: preferred, colored
Changes since revision 1.112: +7 -0 lines
 - Move softdep from using a global worklist to per-mount worklists.  This
   has many positive effects including improved smp locking, reducing
   interdependencies between mounts that can lead to deadlocks, etc.
 - Add the softdep worklist and various counters to the ufsmnt structure.
 - Add a mount pointer to the workitem and remove mount pointers from the
   various structures derived from the workitem as they are now redundant.
 - Remove the poor-man's semaphore protecting softdep_process_worklist and
   softdep_flushworklist.  Several threads may now process the list
   simultaneously.
 - Add softdep_waitidle() to block the thread until all pending
   dependencies being operated on by other threads have been flushed.
 - Use softdep_waitidle() in unmount and snapshots to block either
   operation until the fs is stable.
 - Remove softdep worklist processing from the syncer and move it into the
   softdep_flush() thread.  This thread processes all softdep mounts
   once each second and when it is called via the new softdep_speedup()
   when there is a resource shortage.  This removes the softdep hook
   from the kernel and various hacks in header files to support it.

Reviewed by/Discussed with:	tegge, truckman, mckusick
Tested by:	kris

Revision 1.103.2.3: download - view: text, markup, annotated - select for diffs
Sat Jan 14 01:18:03 2006 UTC (6 years ago) by tegge
Branches: RELENG_6
Diff to: previous 1.103.2.2: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.2: +4 -2 lines
MFC: Add marker vnodes to ensure that all vnodes associated with the mount
     point are iterated over when using MNT_VNODE_FOREACH.

Revision 1.112: download - view: text, markup, annotated - select for diffs
Mon Jan 9 20:42:18 2006 UTC (6 years, 1 month ago) by tegge
Branches: MAIN
Diff to: previous 1.111: preferred, colored
Changes since revision 1.111: +4 -2 lines
Add marker vnodes to ensure that all vnodes associated with the mount point are
iterated over when using MNT_VNODE_FOREACH.

Reviewed by:	truckman
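
The marker technique can be illustrated with the TAILQ macros from
<sys/queue.h>. This is a user-space sketch of the general idea, not the
MNT_VNODE_FOREACH implementation: struct node_sk, foreach_with_marker(),
and the unlink_if_even() visitor are hypothetical, and locking is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

struct node_sk {
    TAILQ_ENTRY(node_sk) link;
    bool is_marker;
    int id;
};
TAILQ_HEAD(nodelist_sk, node_sk);

typedef void (*visit_fn)(struct nodelist_sk *, struct node_sk *, void *);

/*
 * Visit every real node. A marker is parked after the current node before
 * the visitor runs, so the iteration keeps its place even if the visitor
 * unlinks the current node.
 */
static int
foreach_with_marker(struct nodelist_sk *head, visit_fn visit, void *arg)
{
    struct node_sk marker = { .is_marker = true };
    struct node_sk *np = TAILQ_FIRST(head);
    int visited = 0;

    while (np != NULL) {
        if (np->is_marker) {  /* skip markers owned by other iterators */
            np = TAILQ_NEXT(np, link);
            continue;
        }
        TAILQ_INSERT_AFTER(head, np, &marker, link);
        visit(head, np, arg);  /* may unlink np */
        visited++;
        np = TAILQ_NEXT(&marker, link);
        TAILQ_REMOVE(head, &marker, link);
    }
    return (visited);
}

/* Example visitor: removes nodes with an even id while iterating. */
static void
unlink_if_even(struct nodelist_sk *head, struct node_sk *np, void *arg)
{
    (void)arg;
    if (np->id % 2 == 0)
        TAILQ_REMOVE(head, np, link);
}
```

A plain "next pointer" iteration would follow a stale link once the
current node is unlinked; the marker stays on the list and so always has a
valid successor.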

Revision 1.111: download - view: text, markup, annotated - select for diffs
Fri Jan 6 04:44:09 2006 UTC (6 years, 1 month ago) by imp
Branches: MAIN
Diff to: previous 1.110: preferred, colored
Changes since revision 1.110: +58 -0 lines
New option: NO_FFS_SNAPSHOT.  I did this in p4 about the same time
that NetBSD implemented it independently of them (don't know which one
was actually first).  This saves about 24k for those times you don't
need snapshot support (like when running off a ram disk, or in an
embedded environment where size matters).

Revision 1.103.2.1.2.1: download - view: text, markup, annotated - select for diffs
Sat Oct 29 07:00:45 2005 UTC (6 years, 3 months ago) by scottl
Branches: RELENG_6_0
CVS tags: RELENG_6_0_0_RELEASE
Diff to: previous 1.103.2.1: preferred, colored; next MAIN 1.103.2.2: preferred, colored
Changes since revision 1.103.2.1: +50 -58 lines
Sync RELENG_6_0 with all of the FFS fixes from Tor.

Submitted by:	tegge
Approved by:	re

Revision 1.103.2.2: download - view: text, markup, annotated - select for diffs
Sat Oct 29 06:40:41 2005 UTC (6 years, 3 months ago) by scottl
Branches: RELENG_6
Diff to: previous 1.103.2.1: preferred, colored; branchpoint 1.103: preferred, colored
Changes since revision 1.103.2.1: +50 -58 lines
MFC rev 1.106 - 1.110

Submitted by: tegge
Approved by: re

Revision 1.110: download - view: text, markup, annotated - select for diffs
Sun Oct 9 20:15:15 2005 UTC (6 years, 4 months ago) by tegge
Branches: MAIN
Diff to: previous 1.109: preferred, colored
Changes since revision 1.109: +11 -0 lines
Reduce probability for a deadlock that can occur when a snapshot inode is
updated by a process holding the snapshot lock.  Another process updating a
different inode in the same inodeblock will do copy on write checks and lock in
the opposite direction.

The snapshot code forces a copy on write of these blocks manually (cf. start of
expunge_ufs[12]) and these inode blocks are later put on snapblklist.

This partial fix is to 'drain' the relevant ffs_copyonwrite() operation after
installing new snapblklist.  This is not a 100% solution since a failed block
allocation can cause implicit fsync() which might deadlock before the new
snapblklist has been installed.

Revision 1.109: download - view: text, markup, annotated - select for diffs
Sun Oct 9 20:07:51 2005 UTC (6 years, 4 months ago) by tegge
Branches: MAIN
Diff to: previous 1.108: preferred, colored
Changes since revision 1.108: +2 -0 lines
Eliminate a deadlock that can occur when a dirty block belonging to a snapshot
file is flushed by a process not holding snaplk (e.g. bufdaemon).  Another
process might hold snaplk and try to access the block due to ffs_copyonwrite
processing.

Revision 1.108: download - view: text, markup, annotated - select for diffs
Sun Oct 9 20:00:16 2005 UTC (6 years, 4 months ago) by tegge
Branches: MAIN
Diff to: previous 1.107: preferred, colored
Changes since revision 1.107: +2 -3 lines
Eliminate a deadlock that can occur during the cgaccount() processing due to
the cg map buffer being held when writing indirect blocks.  The process ends up
in ffs_copyonwrite(), attempting to get snaplk while holding the cg map buffer
lock.

Another process might be in ffs_copyonwrite(), trying to allocate a new block
for a copy.  It would hold snaplk while trying to get the cg map buffer lock.

Release the cg map buffer early and use the copy for most of the cgaccount
processing to avoid this deadlock.
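
The fix follows a common pattern for breaking a lock order reversal:
snapshot the protected data, drop the first lock, and only then take the
second. A minimal user-space sketch with pthread mutexes; the structure,
the map size, and the bit-counting "work" are hypothetical stand-ins for
the cgaccount() processing.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define CG_MAP_BYTES_SK 16

struct cgbuf_sk {
    pthread_mutex_t lock;  /* stands in for the cg map buffer lock */
    unsigned char map[CG_MAP_BYTES_SK];
};

static pthread_mutex_t snaplk_sk = PTHREAD_MUTEX_INITIALIZER;

/* Count set bits in the cg map without holding both locks at once. */
static int
cgaccount_sk(struct cgbuf_sk *cg)
{
    unsigned char copy[CG_MAP_BYTES_SK];
    int used = 0;

    /* Take a private copy, then release the buffer early. */
    pthread_mutex_lock(&cg->lock);
    memcpy(copy, cg->map, sizeof(copy));
    pthread_mutex_unlock(&cg->lock);

    /* Work that needs snaplk now runs against the copy only. */
    pthread_mutex_lock(&snaplk_sk);
    for (size_t i = 0; i < sizeof(copy); i++)
        for (int bit = 0; bit < 8; bit++)
            used += (copy[i] >> bit) & 1;
    pthread_mutex_unlock(&snaplk_sk);

    return (used);
}
```

Because no thread in this sketch ever waits for snaplk while holding the
cg buffer lock, the cycle described in the message cannot form.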

Revision 1.107: download - view: text, markup, annotated - select for diffs
Sun Oct 9 19:45:01 2005 UTC (6 years, 4 months ago) by tegge
Branches: MAIN
Diff to: previous 1.106: preferred, colored
Changes since revision 1.106: +35 -55 lines
Reduce the probability of low block numbers passed to ffs_snapblkfree() by
skipping the call from ffs_snapremove() if the block number is zero.

Simplify snapshot locking in ffs_copyonwrite() and ffs_snapblkfree() by using
the same locking protocol for low block numbers as for larger block numbers.
This removes a lock leak that could happen if vn_lock() succeeded after
lockmgr() failed in ffs_snapblkfree().

Check if snapshot is gone before retrying a lock in ffs_copyonwrite().

Revision 1.103.2.1: download - view: text, markup, annotated - select for diffs
Tue Oct 4 04:41:27 2005 UTC (6 years, 4 months ago) by truckman
Branches: RELENG_6
CVS tags: RELENG_6_0_BP
Branch point for: RELENG_6_0
Diff to: previous 1.103: preferred, colored
Changes since revision 1.103: +28 -4 lines
MFC snaplk deadlock fix
        src/sys/kern/vfs_bio.c          1.495, 1.496
        src/sys/kern/vfs_subr.c         1.648
        src/sys/sys/buf.h               1.190, 1.191
        src/sys/sys/proc.h              1.436
        src/sys/ufs/ffs/ffs_snapshot.c  1.104, 1.105, 1.106

Original commit messages:

    Log:
    Un-staticize runningbufwakeup() and staticize updateproc.

    Add a new private thread flag to indicate that the thread should
    not sleep if runningbufspace is too large.

    Set this flag on the bufdaemon and syncer threads so that they skip
    the waitrunningbufspace() call in bufwrite() rather than
    checking the proc pointer vs. the known proc pointers for these two
    threads.  A way of preventing these threads from being starved for
    I/O but still placing limits on their outstanding I/O would be
    desirable.

    Set this flag in ffs_copyonwrite() to prevent bufwrite() calls from
    blocking on the runningbufspace check while holding snaplk.  This
    prevents snaplk from being held for an arbitrarily long period of
    time if runningbufspace is high and greatly reduces the contention
    for snaplk.  The disadvantage is that ffs_copyonwrite() can start
    a large amount of I/O if there are a large number of snapshots,
    which could cause a deadlock in other parts of the code.

    Call runningbufwakeup() in ffs_copyonwrite() to decrement runningbufspace
    before attempting to grab snaplk so that I/O requests waiting on
    snaplk are not counted in runningbufspace as being in-progress.
    Increment runningbufspace again before actually launching the
    original I/O request.

    Prior to the above two changes, the system could deadlock if enough
    I/O requests were blocked by snaplk to prevent runningbufspace from
    falling below lorunningspace and one of the bawrite() calls in
    ffs_copyonwrite() blocked in waitrunningbufspace() while holding
    snaplk.

    See <http://www.holm.cc/stress/log/cons143.html>

    Revision  Changes    Path
    1.495     +3 -3      src/sys/kern/vfs_bio.c
    1.648     +2 -1      src/sys/kern/vfs_subr.c
    1.190     +1 -0      src/sys/sys/buf.h
    1.436     +1 -1      src/sys/sys/proc.h
    1.104     +16 -4     src/sys/ufs/ffs/ffs_snapshot.c

    Log:
    Un-staticize waitrunningbufspace() and call it before returning from
    ffs_copyonwrite() if any async writes were launched.

    Restore the thread's previous TDP_NORUNNINGBUF state before returning
    from ffs_copyonwrite().

    Revision  Changes    Path
    1.496     +1 -1      src/sys/kern/vfs_bio.c
    1.191     +1 -0      src/sys/sys/buf.h
    1.105     +13 -1     src/sys/ufs/ffs/ffs_snapshot.c

    Log:
    Correct previous commit to fix the sense of the TDP_NORUNNINGBUF
    check in ffs_copyonwrite() that is a precondition for calling
    waitrunningbufspace().

    Pointed out by: tegge
    Pointy hat to:  truckman
    MFC after:      3 days

    Revision  Changes    Path
    1.106     +1 -1      src/sys/ufs/ffs/ffs_snapshot.c

Approved by:	re (scottl)

Revision 1.106: download - view: text, markup, annotated - select for diffs
Sat Oct 1 19:10:48 2005 UTC (6 years, 4 months ago) by truckman
Branches: MAIN
Diff to: previous 1.105: preferred, colored
Changes since revision 1.105: +1 -1 lines
Correct previous commit to fix the sense of the TDP_NORUNNINGBUF
check in ffs_copyonwrite() that is a precondition for calling
waitrunningbufspace().

Pointed out by:	tegge
Pointy hat to:	truckman
MFC after:	3 days

Revision 1.105: download - view: text, markup, annotated - select for diffs
Fri Sep 30 18:07:41 2005 UTC (6 years, 4 months ago) by truckman
Branches: MAIN
Diff to: previous 1.104: preferred, colored
Changes since revision 1.104: +13 -1 lines
Un-staticize waitrunningbufspace() and call it before returning from
ffs_copyonwrite() if any async writes were launched.

Restore the thread's previous TDP_NORUNNINGBUF state before returning
from ffs_copyonwrite().
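
The save-and-restore idea in this message can be sketched in userspace C.
This is only an illustration under stated assumptions: the flag value, the
struct, and the function name below are hypothetical stand-ins, not the
kernel's definitions, and the real code operates on curthread's td_pflags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit value; the real TDP_NORUNNINGBUF lives in sys/proc.h. */
#define SKETCH_NORUNNINGBUF 0x0100

/* Hypothetical stand-in for the per-thread private flag word. */
struct sketch_thread {
	int pflags;
};

/*
 * Set the flag for the duration of work(), then put the flag bit back to
 * whatever it was on entry, leaving every other bit untouched.
 */
static void
sketch_with_norunningbuf(struct sketch_thread *td,
    void (*work)(struct sketch_thread *))
{
	int saved = td->pflags & SKETCH_NORUNNINGBUF;	/* entry state */

	td->pflags |= SKETCH_NORUNNINGBUF;
	if (work != NULL)
		work(td);
	/* Clear the bit, then OR the remembered state back in. */
	td->pflags = (td->pflags & ~SKETCH_NORUNNINGBUF) | saved;
}
```

Restoring the saved bit rather than clearing it unconditionally matters for
callers such as the bufdaemon and syncer, which already run with the flag set.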

Revision 1.104: download - view: text, markup, annotated - select for diffs
Fri Sep 30 01:30:01 2005 UTC (6 years, 4 months ago) by truckman
Branches: MAIN
Diff to: previous 1.103: preferred, colored
Changes since revision 1.103: +16 -4 lines
Un-staticize runningbufwakeup() and staticize updateproc.

Add a new private thread flag to indicate that the thread should
not sleep if runningbufspace is too large.

Set this flag on the bufdaemon and syncer threads so that they skip
the waitrunningbufspace() call in bufwrite() rather than
checking the proc pointer vs. the known proc pointers for these two
threads.  A way of preventing these threads from being starved for
I/O but still placing limits on their outstanding I/O would be
desirable.

Set this flag in ffs_copyonwrite() to prevent bufwrite() calls from
blocking on the runningbufspace check while holding snaplk.  This
prevents snaplk from being held for an arbitrarily long period of
time if runningbufspace is high and greatly reduces the contention
for snaplk.  The disadvantage is that ffs_copyonwrite() can start
a large amount of I/O if there are a large number of snapshots,
which could cause a deadlock in other parts of the code.

Call runningbufwakeup() in ffs_copyonwrite() to decrement runningbufspace
before attempting to grab snaplk so that I/O requests waiting on
snaplk are not counted in runningbufspace as being in-progress.
Increment runningbufspace again before actually launching the
original I/O request.

Prior to the above two changes, the system could deadlock if enough
I/O requests were blocked by snaplk to prevent runningbufspace from
falling below lorunningspace and one of the bawrite() calls in
ffs_copyonwrite() blocked in waitrunningbufspace() while holding
snaplk.

See <http://www.holm.cc/stress/log/cons143.html>
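
The accounting described in this message can be modelled roughly as follows.
It is a single-threaded sketch, not the kernel code: the struct, the flag
value, the helper names, and the exact wait condition are assumptions made
for illustration.

```c
#include <assert.h>

#define SKETCH_NORUNNINGBUF 0x0100	/* hypothetical flag bit */

/* Hypothetical model of the global write-throttling counters. */
struct sketch_state {
	long runningbufspace;	/* bytes of write I/O counted as in progress */
	long hirunningspace;	/* assumed high-water mark */
};

/* A writer sleeps only if it is over the limit and has not opted out. */
static int
sketch_must_wait(const struct sketch_state *st, int pflags)
{
	if (pflags & SKETCH_NORUNNINGBUF)
		return (0);
	return (st->runningbufspace > st->hirunningspace);
}

/* Un-charge a buffer while its owner waits for the snapshot lock. */
static void
sketch_uncharge(struct sketch_state *st, long bufsize)
{
	st->runningbufspace -= bufsize;
}

/* Re-charge it just before the original I/O request is launched. */
static void
sketch_recharge(struct sketch_state *st, long bufsize)
{
	st->runningbufspace += bufsize;
}
```

The point of the un-charge step is that requests parked on snaplk no longer
keep the counter high, so a writer holding snaplk is less likely to sleep
waiting for space that only the parked requests could free.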

Revision 1.103: download - view: text, markup, annotated - select for diffs
Sun Apr 3 12:03:44 2005 UTC (6 years, 10 months ago) by jeff
Branches: MAIN
CVS tags: RELENG_6_BP
Branch point for: RELENG_6
Diff to: previous 1.102: preferred, colored
Changes since revision 1.102: +13 -13 lines
 - Use M_ZERO rather than explicitly calling bzero().
 - Don't intermingle direct calls to lockmgr and indirect calls through
   VOPs.  This will be important in the future.
 - Don't lock the devvp's interlock just to release it on the next line by
   passing LK_INTERLOCK to lockmgr.
 - Restructure ffs_snapshot_unmount so we don't call free() with the
   devvp's interlock locked.

Revision 1.102: download - view: text, markup, annotated - select for diffs
Thu Mar 31 05:21:17 2005 UTC (6 years, 10 months ago) by jeff
Branches: MAIN
Diff to: previous 1.101: preferred, colored
Changes since revision 1.101: +2 -2 lines
 - Set LK_NOSHARE for snapshot locks.  snapshots require exclusive only
   access.
 - Remove the hack from ffs_lock() to implement LK_NOSHARE in a ffs
   specific way.

Sponsored by:	Isilon Systems, Inc.

Revision 1.101: download - view: text, markup, annotated - select for diffs
Thu Mar 31 04:34:30 2005 UTC (6 years, 10 months ago) by jeff
Branches: MAIN
Diff to: previous 1.100: preferred, colored
Changes since revision 1.100: +2 -2 lines
 - LK_NOPAUSE is a nop now.

Sponsored by:   Isilon Systems, Inc.

Revision 1.84.2.3: download - view: text, markup, annotated - select for diffs
Wed Mar 16 13:39:47 2005 UTC (6 years, 10 months ago) by pb
Branches: RELENG_5
CVS tags: RELENG_5_5_BP, RELENG_5_5_0_RELEASE, RELENG_5_5, RELENG_5_4_BP, RELENG_5_4_0_RELEASE, RELENG_5_4
Diff to: previous 1.84.2.2: preferred, colored; branchpoint 1.84: preferred, colored; next MAIN 1.85: preferred, colored
Changes since revision 1.84.2.2: +1 -1 lines
MFC rev 1.91:

Fixes a bug that caused UFS2 filesystems bigger than 2TB to
prematurely report that they were full and/or to panic the kernel
with the message ``ffs_clusteralloc: allocated out of group''.

Approved by:	re (kensmith), mckusick

Revision 1.100: download - view: text, markup, annotated - select for diffs
Sun Mar 13 12:01:50 2005 UTC (6 years, 11 months ago) by jeff
Branches: MAIN
Diff to: previous 1.99: preferred, colored
Changes since revision 1.99: +1 -1 lines
 - The VI_DOOMED flag now signals the end of a vnode's relationship with
   the filesystem.  Check that rather than VI_XLOCK.

Sponsored by:	Isilon Systems, Inc.

Revision 1.99: download - view: text, markup, annotated - select for diffs
Tue Mar 1 07:38:45 2005 UTC (6 years, 11 months ago) by jeff
Branches: MAIN
Diff to: previous 1.98: preferred, colored
Changes since revision 1.98: +1 -1 lines
 - Fix another dyslexic moment; an atomic_set_int should've become ACTIVESET,
   not ACTIVECLEAR.

Submitted by:	iedowse

Revision 1.84.2.2: download - view: text, markup, annotated - select for diffs
Sun Feb 27 18:01:40 2005 UTC (6 years, 11 months ago) by delphij
Branches: RELENG_5
Diff to: previous 1.84.2.1: preferred, colored; branchpoint 1.84: preferred, colored
Changes since revision 1.84.2.1: +1 -1 lines
MFC revision 1.98
date: 2005/02/19 07:31:33;  author: delphij;  state: Exp;  lines: +1 -1
When clearing a fragment, it's possible that the length is zero.

Reviewed by:	mckusick
MFC After:	1 week

Revision 1.98: download - view: text, markup, annotated - select for diffs
Sat Feb 19 07:31:33 2005 UTC (6 years, 11 months ago) by delphij
Branches: MAIN
Diff to: previous 1.97: preferred, colored
Changes since revision 1.97: +1 -1 lines
When clearing a fragment, it's possible that the length is zero.

Reviewed by:	mckusick
MFC After:	1 week

Revision 1.97: download - view: text, markup, annotated - select for diffs
Tue Feb 8 17:40:01 2005 UTC (7 years ago) by phk
Branches: MAIN
Diff to: previous 1.96: preferred, colored
Changes since revision 1.96: +10 -10 lines
Don't use the UFS_* and VFS_* functions where a direct call is possible.

The UFS_ functions are for UFS to call back into VFS.  The VFS functions
are external entry points into the filesystem.

Revision 1.96: download - view: text, markup, annotated - select for diffs
Tue Feb 8 17:23:39 2005 UTC (7 years ago) by phk
Branches: MAIN
Diff to: previous 1.95: preferred, colored
Changes since revision 1.95: +0 -0 lines
(forced commit to record correct commit message)

Split ffs_fsync() into a VOP_FSYNC() component and an internal part
called ffs_syncvnode().

Eliminate unnecessary thread argument and XXX'ed curthread passes
for same.  Reduce softdep_sync_metadata() from a struct vop_fsync_args
to just the vnode argument it needs.

Convert internal VOP_FSYNC() calls to use ffs_syncvnode().

Revision 1.95: download - view: text, markup, annotated - select for diffs
Tue Feb 8 16:25:50 2005 UTC (7 years ago) by phk
Branches: MAIN
Diff to: previous 1.94: preferred, colored
Changes since revision 1.94: +9 -9 lines
For snapshots we need all VOP_LOCKs to be exclusive.

The "business class upgrade" was implemented in UFS's VOP_LOCK
implementation ufs_lock() which is the wrong layer, so move it to
ffs_lock().

Also, as long as we have not abandoned advanced vfs-stacking we
should not preclude it from happening: instead of implementing a
copy locally, use the VOP_LOCK_APV(&ufs) to correctly arrive at
vop_stdlock() at the bottom.

Revision 1.84.2.1: download - view: text, markup, annotated - select for diffs
Mon Jan 31 23:26:59 2005 UTC (7 years ago) by imp
Branches: RELENG_5
Diff to: previous 1.84: preferred, colored
Changes since revision 1.84: +1 -1 lines
MFC: /*- and related license changes

Revision 1.94: download - view: text, markup, annotated - select for diffs
Mon Jan 24 10:10:11 2005 UTC (7 years ago) by jeff
Branches: MAIN
Diff to: previous 1.93: preferred, colored
Changes since revision 1.93: +26 -10 lines
 - Use the ufs lock to protect fs_active.

Sponsored By:	Isilon Systems, Inc.

Revision 1.93: download - view: text, markup, annotated - select for diffs
Tue Jan 11 07:36:21 2005 UTC (7 years, 1 month ago) by phk
Branches: MAIN
Diff to: previous 1.92: preferred, colored
Changes since revision 1.92: +9 -9 lines
Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC().

I'm not sure why a credential was added to these in the first place, it is
not used anywhere and it doesn't make much sense:

	The credentials for syncing a file (ability to write to the
	file) should be checked at the system call level.

	Credentials for syncing one or more filesystems ("none")
	should be checked at the system call level as well.

	If the filesystem implementation needs a particular credential
	to carry out the syncing it would logically have to be the
	cached mount credential, or a credential cached along with
	any delayed write data.

Discussed with:	rwatson

Revision 1.92: download - view: text, markup, annotated - select for diffs
Fri Jan 7 02:29:25 2005 UTC (7 years, 1 month ago) by imp
Branches: MAIN
Diff to: previous 1.91: preferred, colored
Changes since revision 1.91: +1 -1 lines
/* -> /*- for license, minor formatting changes

Revision 1.91: download - view: text, markup, annotated - select for diffs
Thu Dec 9 21:24:00 2004 UTC (7 years, 2 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.90: preferred, colored
Changes since revision 1.90: +1 -1 lines
Fixes a bug that caused UFS2 filesystems bigger than 2TB to
prematurely report that they were full and/or to panic the kernel
with the message ``ffs_clusteralloc: allocated out of group''.

Submitted by:	Henry Whincup <henry@jot.to>
MFC after:	1 week

Revision 1.90: download - view: text, markup, annotated - select for diffs
Wed Dec 8 11:54:06 2004 UTC (7 years, 2 months ago) by phk
Branches: MAIN
Diff to: previous 1.89: preferred, colored
Changes since revision 1.89: +1 -1 lines
Fix snapshot creation.

Revision 1.89: download - view: text, markup, annotated - select for diffs
Fri Oct 29 10:15:55 2004 UTC (7 years, 3 months ago) by phk
Branches: MAIN
Diff to: previous 1.88: preferred, colored
Changes since revision 1.88: +17 -13 lines
Move UFS from DEVFS backing to GEOM backing.

This eliminates a bunch of vnode overhead (approx 1-2 % speed
improvement) and gives us more control over the access to the storage
device.

Access counts on the underlying device are not correctly tracked and
therefore it is possible to read-only mount the same disk device multiple
times:
	syv# mount -p
	/dev/md0        /var    ufs rw  2 2
	/dev/ad0        /mnt    ufs ro  1 1
	/dev/ad0        /mnt2   ufs ro  1 1
	/dev/ad0        /mnt3   ufs ro  1 1

Since UFS/FFS is not a synchronously consistent filesystem (i.e., it caches
things in RAM) this is not possible with read-write mounts, and the system
will correctly reject this.

Details:

	Add a geom consumer and a bufobj pointer to ufsmount.

	Eliminate the vnode argument from softdep_disk_prewrite().
	Pick the vnode out of bp->b_vp for now.  Eventually we
	should find it through bp->b_bufobj->b_private.

	In the mountcode, use g_vfs_open() once we have used
	VOP_ACCESS() to check permissions.

	When upgrading and downgrading between r/o and r/w do the
	right thing with GEOM access counts.  Remove all the
	workarounds for not being able to do this with VOP_OPEN().

	If we are the root mount, drop the exclusive access count
	until we upgrade to r/w.  This allows fsck of the root
	filesystem and the MNT_RELOAD to work correctly.

	Set bo_private to the GEOM consumer on the device bufobj.

	Change the ffs_ops->strategy function to call g_vfs_strategy()

	In ufs_strategy() directly call the strategy on the disk
	bufobj.  Same in rawread.

	In ffs_fsync() we will no longer see VCHR device nodes, so
	remove code which synced the filesystem mounted on it, in
	case we came there.  I'm not sure this code made sense in
	the first place since we would have taken the specfs route
	on such a vnode.

	Redo the highly bogus readblock() function in the snapshot
	code to something slightly less bogus: Constructing an uio
	and using physio was really quite a detour.  Instead just
	fill in a bio and ship it down.

Revision 1.88: download - view: text, markup, annotated - select for diffs
Tue Oct 26 06:25:56 2004 UTC (7 years, 3 months ago) by phk
Branches: MAIN
Diff to: previous 1.87: preferred, colored
Changes since revision 1.87: +1 -6 lines
Degeneralize the per cdev copyonwrite callback.  The only possible value
is ffs_copyonwrite() and the only place it can be called from is FFS which
would never want to call another filesystems copyonwrite method, should one
exist, so there is no reason why anything generic should know about this.

Revision 1.87: download - view: text, markup, annotated - select for diffs
Thu Sep 16 17:28:56 2004 UTC (7 years, 4 months ago) by phk
Branches: MAIN
Diff to: previous 1.86: preferred, colored
Changes since revision 1.86: +4 -0 lines
Do not traverse list of snapshots if there isn't one.

Found by:	scottl

Revision 1.86: download - view: text, markup, annotated - select for diffs
Thu Sep 16 15:58:18 2004 UTC (7 years, 4 months ago) by phk
Branches: MAIN
Diff to: previous 1.85: preferred, colored
Changes since revision 1.85: +8 -11 lines
Missed a place where snapshots were allocated in my last commit to
this file.

Revision 1.85: download - view: text, markup, annotated - select for diffs
Mon Sep 13 07:29:45 2004 UTC (7 years, 4 months ago) by phk
Branches: MAIN
Diff to: previous 1.84: preferred, colored
Changes since revision 1.84: +88 -70 lines
Create struct snapdata which contains the snapshot fields from cdev
and the previously malloc'ed snapshot lock.

Malloc struct snapdata instead of just the lock.

Replace snapshot fields in cdev with pointer to snapdata (saves 16 bytes).

While here, give the private readblock() function a vnode argument
in preparation for moving UFS to access GEOM directly.

Revision 1.84: download - view: text, markup, annotated - select for diffs
Wed Jul 28 06:41:27 2004 UTC (7 years, 6 months ago) by kan
Branches: MAIN
CVS tags: RELENG_5_BP, RELENG_5_3_BP, RELENG_5_3_0_RELEASE, RELENG_5_3
Branch point for: RELENG_5
Diff to: previous 1.83: preferred, colored
Changes since revision 1.83: +14 -13 lines
Avoid using casts as lvalues. Introduce DIP_SET macro which sets proper
inode field based on UFS version. Use DIP to read values and DIP_SET
to modify them throughout FFS code base.
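
A read accessor paired with a separate setter might look like the sketch
below. The struct layouts and the SKETCH_ macro names are hypothetical
simplifications, not the definitions from the FFS headers; only the idea
(assign to the real member instead of to a cast expression) comes from the
message above.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_dinode1 { int32_t di_size; };	/* UFS1-style narrow field */
struct sketch_dinode2 { int64_t di_size; };	/* UFS2-style wide field */

/* Hypothetical in-core inode holding one of the two on-disk layouts. */
struct sketch_inode {
	int is_ufs1;
	struct sketch_dinode1 *din1;
	struct sketch_dinode2 *din2;
};

/* Read accessor: an rvalue selected by filesystem version. */
#define SKETCH_DIP(ip, field) \
	((ip)->is_ufs1 ? (int64_t)(ip)->din1->di_##field : \
	    (ip)->din2->di_##field)

/* Write accessor: a plain assignment to the right member, no cast on the left. */
#define SKETCH_DIP_SET(ip, field, val) do {				\
	if ((ip)->is_ufs1)						\
		(ip)->din1->di_##field = (val);				\
	else								\
		(ip)->din2->di_##field = (val);				\
} while (0)
```

Because the read macro yields a conditional expression, it cannot portably
appear on the left of an assignment, which is why a dedicated setter is needed.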

Revision 1.83: download - view: text, markup, annotated - select for diffs
Sun Jul 4 08:52:35 2004 UTC (7 years, 7 months ago) by phk
Branches: MAIN
Diff to: previous 1.82: preferred, colored
Changes since revision 1.82: +1 -8 lines
When we traverse the vnodes on a mountpoint we need to look out for
our cached 'next vnode' being removed from this mountpoint.  If we
find that it was recycled, we restart our traversal from the start
of the list.

Code to do that is in all local disk filesystems (and a few other
places) and looks roughly like this:

		MNT_ILOCK(mp);
	loop:
		for (vp = TAILQ_FIRST(&mp...);
		    (vp = nvp) != NULL;
		    nvp = TAILQ_NEXT(vp,...)) {
			if (vp->v_mount != mp)
				goto loop;
			MNT_IUNLOCK(mp);
			...
			MNT_ILOCK(mp);
		}
		MNT_IUNLOCK(mp);

The code which takes vnodes off a mountpoint looks like this:

	MNT_ILOCK(vp->v_mount);
	...
	TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes);
	...
	MNT_IUNLOCK(vp->v_mount);
	...
	vp->v_mount = something;

(Take a moment and try to spot the locking error before you read on.)

On an SMP system, one CPU could have removed nvp from our mountlist
but not yet gotten to assign a new value to vp->v_mount while another
CPU simultaneously gets to the top of the traversal loop where it
finds that (vp->v_mount != mp) is not true despite the fact that
the vnode has indeed been removed from our mountpoint.

Fix:

Introduce the macro MNT_VNODE_FOREACH() to traverse the list of
vnodes on a mountpoint while taking into account that vnodes may
be removed from the list as we go.  This saves approx 65 lines of
duplicated code.

Split the insmntque() which potentially moves a vnode from one mount
point to another into delmntque() and insmntque() which do just
what the names say.

Fix delmntque() to set vp->v_mount to NULL while holding the
mountpoint lock.
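
The ordering fix can be illustrated with a small single-threaded model. It
does not demonstrate the race itself; it only shows the list removal and the
v_mount update happening inside the same locked region. All sketch_ names
are hypothetical, and the "lock" is a plain flag standing in for the
mountpoint mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct sketch_mount;

struct sketch_vnode {
	struct sketch_mount *v_mount;
	TAILQ_ENTRY(sketch_vnode) v_link;
};

struct sketch_mount {
	int locked;	/* stand-in for the mountpoint mutex; 1 while held */
	TAILQ_HEAD(, sketch_vnode) vnodes;
};

static void
sketch_lock(struct sketch_mount *mp)
{
	assert(!mp->locked);
	mp->locked = 1;
}

static void
sketch_unlock(struct sketch_mount *mp)
{
	assert(mp->locked);
	mp->locked = 0;
}

/* Put vp on mp's list and record the owner, both under the lock. */
static void
sketch_insmntque(struct sketch_vnode *vp, struct sketch_mount *mp)
{
	sketch_lock(mp);
	vp->v_mount = mp;
	TAILQ_INSERT_TAIL(&mp->vnodes, vp, v_link);
	sketch_unlock(mp);
}

/* Take vp off its mount's list and clear the owner before unlocking. */
static void
sketch_delmntque(struct sketch_vnode *vp)
{
	struct sketch_mount *mp = vp->v_mount;

	sketch_lock(mp);
	TAILQ_REMOVE(&mp->vnodes, vp, v_link);
	vp->v_mount = NULL;	/* cleared while the lock is still held */
	sketch_unlock(mp);
}

/* Count list members; each one must still point back at mp. */
static int
sketch_count(struct sketch_mount *mp)
{
	struct sketch_vnode *vp;
	int n = 0;

	sketch_lock(mp);
	TAILQ_FOREACH(vp, &mp->vnodes, v_link) {
		assert(vp->v_mount == mp);
		n++;
	}
	sketch_unlock(mp);
	return (n);
}
```

With this ordering, a traversal that holds the lock never sees a vnode that
is off the list but still claims to belong to the mountpoint.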

Revision 1.82: download - view: text, markup, annotated - select for diffs
Fri Jun 18 14:35:17 2004 UTC (7 years, 7 months ago) by kuriyama
Branches: MAIN
Diff to: previous 1.81: preferred, colored
Changes since revision 1.81: +9 -0 lines
Avoid deadlock which is caused by locking VDIR of parent and VREG of
snapshot itself in wrong order.
We can skip unlink check of that directory because it must have
snapshot in it.

Reviewed by:	mckusick and current@

Revision 1.81: download - view: text, markup, annotated - select for diffs
Wed Jun 16 00:26:30 2004 UTC (7 years, 7 months ago) by julian
Branches: MAIN
Diff to: previous 1.80: preferred, colored
Changes since revision 1.80: +4 -4 lines
Nice is a property of a process as a whole.
I mistakenly moved it to the ksegroup when breaking up the process
structure. Put it back in the proc structure.

Revision 1.80: download - view: text, markup, annotated - select for diffs
Tue Jun 8 13:08:18 2004 UTC (7 years, 8 months ago) by stefanf
Branches: MAIN
Diff to: previous 1.79: preferred, colored
Changes since revision 1.79: +2 -2 lines
Avoid assignments to cast expressions.

Reviewed by:	md5
Approved by:	das (mentor)

Revision 1.79: download - view: text, markup, annotated - select for diffs
Fri Feb 13 02:02:06 2004 UTC (8 years ago) by kuriyama
Branches: MAIN
Diff to: previous 1.78: preferred, colored
Changes since revision 1.78: +6 -2 lines
Fix style bugs in previous commit.

Submitted by:	bde

Revision 1.78: download - view: text, markup, annotated - select for diffs
Thu Feb 12 08:52:08 2004 UTC (8 years ago) by kuriyama
Branches: MAIN
Diff to: previous 1.77: preferred, colored
Changes since revision 1.77: +6 -4 lines
Reverse lock order by using local variable.  This will shut up "acquiring
duplicate lock of same type" message.

Reviewed by:	mckusick

Revision 1.77: download - view: text, markup, annotated - select for diffs
Sun Jan 4 04:08:34 2004 UTC (8 years, 1 month ago) by kan
Branches: MAIN
Diff to: previous 1.76: preferred, colored
Changes since revision 1.76: +2 -2 lines
Avoid calling vprint on a vnode while holding its interlock mutex.
Move diagnostic printf after vget. This might delay the debug
output some, but at least it keeps kernel from exploding if
DEBUG_VFS_LOCKS is in effect.

Revision 1.76: download - view: text, markup, annotated - select for diffs
Thu Nov 13 03:56:32 2003 UTC (8 years, 3 months ago) by alc
Branches: MAIN
CVS tags: RELENG_5_2_BP, RELENG_5_2_1_RELEASE, RELENG_5_2_0_RELEASE, RELENG_5_2
Diff to: previous 1.75: preferred, colored
Changes since revision 1.75: +1 -1 lines
Call free(9) after the vnode interlock is released, avoiding a lock-order
reversal.

Revision 1.75: download - view: text, markup, annotated - select for diffs
Wed Nov 5 04:30:08 2003 UTC (8 years, 3 months ago) by kan
Branches: MAIN
Diff to: previous 1.74: preferred, colored
Changes since revision 1.74: +8 -8 lines
Remove mntvnode_mtx and replace it with per-mountpoint mutex.
Introduce two new macros MNT_ILOCK(mp)/MNT_IUNLOCK(mp) to
operate on this mutex transparently.

Eventually new mutex will be protecting more fields in
struct mount, not only vnode list.

Discussed with: jeff

Revision 1.74: download - view: text, markup, annotated - select for diffs
Thu Oct 23 21:14:08 2003 UTC (8 years, 3 months ago) by jhb
Branches: MAIN
Diff to: previous 1.73: preferred, colored
Changes since revision 1.73: +13 -13 lines
Move the P_COWINPROGRESS flag from being a per-process p_flag to being a
per-thread td_pflag which doesn't require any locks to read or write as it
is only read or written by curthread on itself.

Glanced at by:	mckusick

Revision 1.73: download - view: text, markup, annotated - select for diffs
Fri Oct 17 13:57:58 2003 UTC (8 years, 3 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.72: preferred, colored
Changes since revision 1.72: +16 -6 lines
When expunging unlinked files from a snapshot, skip over holes in the
file rather than panicking with "indiracct: botched params".

Submitted by:	Mark Santcroos <marks@ripe.net>

Revision 1.72: download - view: text, markup, annotated - select for diffs
Sun Oct 5 06:48:37 2003 UTC (8 years, 4 months ago) by jeff
Branches: MAIN
Diff to: previous 1.71: preferred, colored
Changes since revision 1.71: +2 -1 lines
 - Skip over xvp if XLOCK is set.

Revision 1.71: download - view: text, markup, annotated - select for diffs
Sat Oct 4 14:25:45 2003 UTC (8 years, 4 months ago) by jeff
Branches: MAIN
Diff to: previous 1.70: preferred, colored
Changes since revision 1.70: +14 -6 lines
 - Fix an unlocked call to GETATTR by slightly shuffling the code in
   ffs_snapshot() around.
 - Acquire the interlock before releasing the mntvnode_mtx.  Use the
   interlock to protect v_usecount access.

Revision 1.70: download - view: text, markup, annotated - select for diffs
Wed Jun 11 06:31:28 2003 UTC (8 years, 8 months ago) by obrien
Branches: MAIN
Diff to: previous 1.69: preferred, colored
Changes since revision 1.69: +3 -1 lines
Use __FBSDID().

Revision 1.69: download - view: text, markup, annotated - select for diffs
Wed Apr 30 12:57:40 2003 UTC (8 years, 9 months ago) by markm
Branches: MAIN
CVS tags: RELENG_5_1_BP, RELENG_5_1_0_RELEASE, RELENG_5_1
Diff to: previous 1.68: preferred, colored
Changes since revision 1.68: +1 -1 lines
Fix some easy, global, lint warnings. In most cases, this means
making some local variables static. In a couple of cases, this means
removing an unused variable.

Revision 1.68: download - view: text, markup, annotated - select for diffs
Tue Apr 22 20:45:38 2003 UTC (8 years, 9 months ago) by jhb
Branches: MAIN
Diff to: previous 1.67: preferred, colored
Changes since revision 1.67: +10 -1 lines
Lock both the proc lock and sched_lock when calling sched_nice since
kg_nice is now protected by both.  Being protected by both means that
other places in the kernel that want to read kg_nice only need one of the
two locks.

Revision 1.67: download - view: text, markup, annotated - select for diffs
Sat Apr 12 01:05:19 2003 UTC (8 years, 10 months ago) by jeff
Branches: MAIN
Diff to: previous 1.66: preferred, colored
Changes since revision 1.66: +3 -2 lines
 - Use the sched_nice() api instead of setting the nice value directly.

Tested by:	Steve Kargl <sgk@troutmask.apl.washington.edu>

Revision 1.66: download - view: text, markup, annotated - select for diffs
Thu Mar 20 21:17:40 2003 UTC (8 years, 10 months ago) by jhb
Branches: MAIN
Diff to: previous 1.65: preferred, colored
Changes since revision 1.65: +1 -1 lines
Use td->td_ucred instead of td->td_proc->p_ucred.

Revision 1.64: download - view: text, markup, annotated - select for diffs
Fri Mar 7 23:49:16 2003 UTC (8 years, 11 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.63: preferred, colored
Changes since revision 1.63: +3 -3 lines
Use the appropriate size when zeroing out the unused portion
of a snapshot's copy of a superblock. This patch fixes a panic
when taking a snapshot of a 4096/512 filesystem.

Reported by:	Ian Freislich <ianf@za.uu.net>
Sponsored by:   DARPA & NAI Labs.

Revision 1.63: download - view: text, markup, annotated - select for diffs
Tue Mar 4 00:04:43 2003 UTC (8 years, 11 months ago) by jeff
Branches: MAIN
Diff to: previous 1.62: preferred, colored
Changes since revision 1.62: +2 -2 lines
 - Add a new 'flags' parameter to getblk().
 - Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT
   flag to the initial BUF_LOCK().  This will eventually be used in cases
   where we want to use a buffer only if it is not currently in use.
 - Convert all consumers of the getblk() api to use this extra parameter.

Reviewed by:	arch
Not objected to by:	mckusick

Revision 1.62: download - view: text, markup, annotated - select for diffs
Sat Feb 22 00:59:34 2003 UTC (8 years, 11 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.61: preferred, colored
Changes since revision 1.61: +43 -7 lines
This patch fixes a deadlock between the bufdaemon and a process taking
a snapshot. As part of taking a snapshot of a filesystem, the kernel
builds up a list of the filesystem metadata (such as the cylinder
group bitmaps) that are contained in the snapshot. When doing a
copy-on-write check, the list is first consulted. If the block being
written is found on the list, then the full snapshot lookup can be
avoided. Besides providing an important performance speedup this
check also avoids a potential deadlock between the code creating
the snapshot and the bufdaemon trying to cleanup snapshot related
buffers. This fix creates a temporary list containing the key
metadata blocks that can cause the deadlock. This temporary list
is used between the time that the snapshot is first enabled and the
time that the fully complete list is built.

Reported by:	Attila Nagy <bra@fsn.hu>
Sponsored by:   DARPA & NAI Labs.
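
The "consult the list first" check can be sketched as a lookup in a
precomputed array of block numbers. The sorted-array layout, the binary
search, and the function names are assumptions made for this illustration;
the message only states that a list is consulted before the full snapshot
lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if lbn appears in the sorted array list[0..n-1], else 0. */
static int
sketch_in_blklist(const int64_t *list, size_t n, int64_t lbn)
{
	size_t lo = 0, hi = n;	/* search the half-open range [lo, hi) */

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (list[mid] == lbn)
			return (1);
		if (list[mid] < lbn)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (0);
}

/* Copy-on-write check: the expensive lookup is needed only on a miss. */
static int
sketch_needs_full_lookup(const int64_t *list, size_t n, int64_t lbn)
{
	return (!sketch_in_blklist(list, n, lbn));
}
```

A hit on the list lets the write path return early without touching the
snapshot's block map, which is where the deadlock with the bufdaemon arose.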

Revision 1.61: download - view: text, markup, annotated - select for diffs
Sat Feb 22 00:29:51 2003 UTC (8 years, 11 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.60: preferred, colored
Changes since revision 1.60: +4 -0 lines
This patch fixes a bug that caused an active filesystem on which a
snapshot is being taken to panic with either "freeing free block" or
"freeing free inode". The problem arises when the snapshot code
is scanning the filesystem looking for inodes with a reference
count of zero (e.g., unlinked but still open) so that it can
expunge them from its view. If it encounters a reclaimed vnode
and has to restart its scan, then it will panic if it encounters
and tries to free an inode that it has already processed. The fix
is to check each candidate inode to see if it has already been
processed before trying to delete it from the snapshot image.

Sponsored by:   DARPA & NAI Labs.

Revision 1.60: download - view: text, markup, annotated - select for diffs
Sat Feb 22 00:19:26 2003 UTC (8 years, 11 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.59: preferred, colored
Changes since revision 1.59: +3 -3 lines
This patch fixes a bug in the logical block calculation macros so
that they convert to 64-bit values before shifting rather than
afterwards. Once fixed, they can be used rather than inline expanded.

Sponsored by:   DARPA & NAI Labs.
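
The difference between widening before and after the shift can be shown
with a pair of small functions. They are illustrative only: the names are
hypothetical, unsigned types are used so the overflow is well defined, and
the sketch assumes a platform where unsigned int is 32 bits wide. The real
macros operate on the filesystem's own block-number types.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy shape: the shift happens in 32 bits, so high bits are lost first. */
static uint64_t
sketch_shift_then_widen(uint32_t lbn, unsigned shift)
{
	return ((uint64_t)(lbn << shift));
}

/* Fixed shape: widen to 64 bits, then shift. */
static uint64_t
sketch_widen_then_shift(uint32_t lbn, unsigned shift)
{
	return ((uint64_t)lbn << shift);
}
```

For small block numbers the two agree, which is why this class of bug tends
to surface only on large filesystems.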

Revision 1.59: download - view: text, markup, annotated - select for diffs
Wed Feb 19 05:47:45 2003 UTC (8 years, 11 months ago) by imp
Branches: MAIN
Diff to: previous 1.58: preferred, colored
Changes since revision 1.58: +8 -8 lines
Back out M_* changes, per decision of the TRB.

Approved by: trb

Revision 1.58: download - view: text, markup, annotated - select for diffs
Tue Jan 21 08:56:15 2003 UTC (9 years ago) by alfred
Branches: MAIN
Diff to: previous 1.57: preferred, colored
Changes since revision 1.57: +8 -8 lines
Remove M_TRYWAIT/M_WAITOK/M_WAIT.  Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.

Revision 1.53.2.3: download - view: text, markup, annotated - select for diffs
Sat Dec 21 03:07:24 2002 UTC (9 years, 1 month ago) by mckusick
Branches: RELENG_5_0
CVS tags: RELENG_5_0_0_RELEASE
Diff to: previous 1.53.2.2: preferred, colored; branchpoint 1.53: preferred, colored; next MAIN 1.54: preferred, colored
Changes since revision 1.53.2.2: +12 -4 lines
MFC of revision 1.57 of sys/ufs/ffs/ffs_snapshot.c. This update
corrects a sign-post error introduced in revision 1.56. The effect
was to put an additional (incorrect) block into the lookup list for
each metablock in the snapshot. The extra incorrect block causes
background fsck to panic the kernel with "freeing free block".
For large filesystems, the large number of extra blocks can overrun
the area malloc'ed to hold the lookup list resulting in corruption
of the malloc arena.

Reported by:	Aurelien Nephtali <aurelien.nephtali@wanadoo.fr>
Sponsored by:   DARPA & NAI Labs.
Approved by:	re

Revision 1.57: download - view: text, markup, annotated - select for diffs
Wed Dec 18 19:50:28 2002 UTC (9 years, 1 month ago) by mckusick
Branches: MAIN
Diff to: previous 1.56: preferred, colored
Changes since revision 1.56: +12 -4 lines
Fix corruption introduced in previous delta.

Reported by:	Aurelien Nephtali <aurelien.nephtali@wanadoo.fr>
Sponsored by:   DARPA & NAI Labs.

Revision 1.56: download - view: text, markup, annotated - select for diffs
Wed Dec 18 07:19:41 2002 UTC (9 years, 1 month ago) by mckusick
Branches: MAIN
Diff to: previous 1.55: preferred, colored
Changes since revision 1.55: +4 -14 lines
Keep comments consistent with the code. Minor optimization.

Sponsored by:   DARPA & NAI Labs.

Revision 1.53.2.2: download - view: text, markup, annotated - select for diffs
Wed Dec 18 07:11:42 2002 UTC (9 years, 1 month ago) by mckusick
Branches: RELENG_5_0
Diff to: previous 1.53.2.1: preferred, colored; branchpoint 1.53: preferred, colored
Changes since revision 1.53.2.1: +4 -14 lines
Correctly apply the patch to ffs_snapshot.c approved earlier today.
Some day I will learn to extract things from CVS properly.

Sponsored by:   DARPA & NAI Labs.
Approved by:	re

Revision 1.53.2.1: download - view: text, markup, annotated - select for diffs
Tue Dec 17 22:36:33 2002 UTC (9 years, 1 month ago) by mckusick
Branches: RELENG_5_0
Diff to: previous 1.53: preferred, colored
Changes since revision 1.53: +183 -135 lines
Only the most recent snapshot contains the complete list of blocks
that were copied in all of the earlier snapshots, thus its precomputed
list must be used in the copyonwrite test. Using incomplete lists may
lead to deadlock. Also do not include the blocks used for the indirect
pointers in the indirect pointers as this may lead to inconsistent
snapshots.

Reviewed by:    Ian Dowse <iedowse@maths.tcd.ie>
Sponsored by:   DARPA & NAI Labs.
Approved by:    re
MFC from:	src/sys/sys/conf.h 1.151
MFC from:	src/sys/ufs/ufs/inode.h 1.42 & 1.43
MFC from:	src/sys/ufs/ffs/ffs_snapshot.c 1.54 & 1.55

Revision 1.55: download - view: text, markup, annotated - select for diffs
Sun Dec 15 19:25:59 2002 UTC (9 years, 1 month ago) by mckusick
Branches: MAIN
Diff to: previous 1.54: preferred, colored
Changes since revision 1.54: +6 -8 lines
Update to previous change (1.54) to use an appropriately wide inode field
so as to work correctly on 64-bit platforms.

Reported-by:	Jake Burkholder <jake@locore.ca>
Sponsored by:   DARPA & NAI Labs.
Approved by:	Ian Dowse <iedowse@maths.tcd.ie>

Revision 1.54: download - view: text, markup, annotated - select for diffs
Sat Dec 14 01:36:59 2002 UTC (9 years, 2 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.53: preferred, colored
Changes since revision 1.53: +188 -138 lines
Only the most recent snapshot contains the complete list of blocks
that were copied in all of the earlier snapshots, thus its precomputed
list must be used in the copyonwrite test. Using incomplete lists may
lead to deadlock. Also do not include the blocks used for the indirect
pointers in the indirect pointers as this may lead to inconsistent
snapshots.

Sponsored by:   DARPA & NAI Labs.
Approved by:	re

Revision 1.53: download - view: text, markup, annotated - select for diffs
Tue Dec 3 18:19:27 2002 UTC (9 years, 2 months ago) by mckusick
Branches: MAIN
CVS tags: RELENG_5_0_BP
Branch point for: RELENG_5_0
Diff to: previous 1.52: preferred, colored
Changes since revision 1.52: +30 -24 lines
Have to use bread() rather than UFS_BALLOC() when obtaining a
previously allocated block as the previous use of the block may
have fallen out of the cache. Failure to reread its contents causes
zeroed results to be written instead of the proper contents.
Conversely, when the block is going to be entirely filled in, it
is not necessary to reread the old contents.

Sponsored by:   DARPA & NAI Labs.
Approved by:	re

Revision 1.52: download - view: text, markup, annotated - select for diffs
Sat Nov 30 19:00:51 2002 UTC (9 years, 2 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.51: preferred, colored
Changes since revision 1.51: +112 -54 lines
Remove a race condition / deadlock from snapshots. When
converting from individual vnode locks to the snapshot
lock, be sure to pass any waiting processes along to the
new lock as well. This transfer is done by a new function
in the lock manager, transferlockers(from_lock, to_lock);
Thanks to Lamont Granquist <lamont@scriptkiddie.org> for
his help in pounding on snapshots beyond all reason and
finding this deadlock.

Sponsored by:   DARPA & NAI Labs.

Revision 1.51: download - view: text, markup, annotated - select for diffs
Sat Nov 30 07:27:12 2002 UTC (9 years, 2 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.50: preferred, colored
Changes since revision 1.50: +7 -2 lines
Fix two deadlocks in snapshots:

1) Release the snapshot file lock while suspending the system. Otherwise
   a process trying to read the lock may block on its containing directory
   preventing the suspension from completing. Thanks to Sean Kelly
   <smkelly@zombie.org> for finding this deadlock.

2) Replace some bdwrite's with bawrite's so as not to fill all the
   buffers with dirty data. The buffers could not be cleaned as the
   snapshot vnode was locked hence the system could deadlock when
   making snapshots of really massive filesystems. Thanks to
   Hidetoshi Shimokawa <simokawa@sat.t.u-tokyo.ac.jp> for figuring
   this out.

Sponsored by:   DARPA & NAI Labs.

Revision 1.50: download - view: text, markup, annotated - select for diffs
Wed Nov 27 02:18:58 2002 UTC (9 years, 2 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.49: preferred, colored
Changes since revision 1.49: +4 -4 lines
Create a new 32-bit fs_flags word in the superblock. Add code to move
the old 8-bit fs_old_flags to the new location the first time that the
filesystem is mounted by a new kernel. One of the unused flags in
fs_old_flags is used to indicate that the flags have been moved.
Leave the fs_old_flags word intact so that it will work properly if
used on an old kernel.

Change the fs_sblockloc superblock location field to be in units
of bytes instead of in units of filesystem fragments. The old units
did not work properly when the fragment size exceeded the superblock
size (8192). Update old fs_sblockloc values at the same time that
the flags are moved.

Suggested by:	BOUWSMA Barry <freebsd-misuser@netscum.dyndns.dk>
Sponsored by:   DARPA & NAI Labs.
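
A minimal sketch of the one-time migration described above, not the kernel code itself. The structure, the field names, and the OLD_FLAGS_MOVED bit are hypothetical stand-ins; only the idea (copy the 8-bit flags once, mark them as moved, convert the location from fragments to bytes) comes from the commit message.

```c
#include <stdint.h>

#define OLD_FLAGS_MOVED 0x80    /* hypothetical "already migrated" bit */

struct sb_sketch {
    uint8_t  old_flags;     /* legacy 8-bit flags byte */
    uint32_t flags;         /* new 32-bit flags word */
    int64_t  sblockloc;     /* superblock location */
    int32_t  fsize;         /* fragment size in bytes */
};

/* Returns 1 if the superblock was migrated, 0 if nothing was needed. */
static int
sb_migrate(struct sb_sketch *sb)
{
    if (sb->old_flags & OLD_FLAGS_MOVED)
        return (0);
    /* Copy the legacy flags; the old byte stays usable by old kernels. */
    sb->flags = sb->old_flags;
    sb->old_flags |= OLD_FLAGS_MOVED;
    /* Convert the location from units of fragments to units of bytes. */
    sb->sblockloc *= sb->fsize;
    return (1);
}
```

Calling sb_migrate() a second time is a no-op, which is what lets the conversion run safely on every mount.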

Revision 1.49: download - view: text, markup, annotated - select for diffs
Fri Nov 15 22:36:57 2002 UTC (9 years, 2 months ago) by peter
Branches: MAIN
Diff to: previous 1.48: preferred, colored
Changes since revision 1.48: +2 -2 lines
Do not assume that time_t is an int.

Approved by:	re (jhb)
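
The message does not say which code assumed an int-sized time_t. One common place the assumption shows up is in printf-style formatting, so the sketch below illustrates that case only; format_time() is a hypothetical helper, not a function from the file.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Format a time_t without assuming its width: widen it to intmax_t. */
static int
format_time(char *buf, size_t len, time_t t)
{
    return (snprintf(buf, len, "%jd", (intmax_t)t));
}
```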

Revision 1.48: download - view: text, markup, annotated - select for diffs
Fri Oct 25 00:20:37 2002 UTC (9 years, 3 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.47: preferred, colored
Changes since revision 1.47: +4 -1 lines
Within ufs, the ffs_sync and ffs_fsync functions did not always
check for and/or report I/O errors. The result is that a VFS_SYNC
or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in
the presence of a hard error writing a disk sector or in a filesystem
full condition. This patch ensures that I/O errors will always be
checked and returned.  This patch also ensures that every call to
VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes
appropriate action when an error is returned.

Sponsored by:   DARPA & NAI Labs.
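
The failure mode above can be sketched in user space as follows. This is an illustration under assumptions, not the ffs_sync code: flush_fn, sync_wait(), and the fake_disk callback are all hypothetical names.

```c
#include <errno.h>

/*
 * Hypothetical writer callback: returns 0 or an errno value, and
 * stores the number of buffers still dirty in *remaining.
 */
typedef int (*flush_fn)(void *ctx, int *remaining);

/* Flush until nothing is dirty, but stop at the first I/O error. */
static int
sync_wait(flush_fn flush, void *ctx)
{
    int error, remaining;

    do {
        remaining = 0;
        error = flush(ctx, &remaining);
        if (error != 0)
            return (error);     /* report it, do not retry forever */
    } while (remaining > 0);
    return (0);
}

/* Example callback: fails with EIO once its budget is used up. */
struct fake_disk {
    int dirty;
    int fail_after;
};

static int
fake_flush(void *ctx, int *remaining)
{
    struct fake_disk *d = ctx;

    if (d->fail_after-- <= 0)
        return (EIO);
    if (d->dirty > 0)
        d->dirty--;
    *remaining = d->dirty;
    return (0);
}
```

Without the error check inside the loop, a callback that keeps failing would leave buffers dirty and the loop would never terminate.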

Revision 1.47: download - view: text, markup, annotated - select for diffs
Tue Oct 22 01:23:00 2002 UTC (9 years, 3 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.46: preferred, colored
Changes since revision 1.46: +22 -13 lines
This update further fine tunes the locking of snapshot vnodes in
the ffs_copyonwrite routine to avoid a deadlock between the syncer
daemon trying to sync out a snapshot vnode and the bufdaemon
trying to write out a buffer containing the snapshot inode.
With any luck this will be the last snapshot race condition.

Sponsored by:	DARPA & NAI Labs.

Revision 1.46: download - view: text, markup, annotated - select for diffs
Tue Oct 22 00:59:49 2002 UTC (9 years, 3 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.45: preferred, colored
Changes since revision 1.45: +2 -2 lines
This checkin reimplements the io-request priority hack in a way
that works in the new threaded kernel. It was commented out of
the disksort routine earlier this year for the reasons given in
kern/subr_disklabel.c (which is where this code used to reside
before it moved to kern/subr_disk.c):


Revision 1.45: download - view: text, markup, annotated - select for diffs
Wed Oct 16 00:19:23 2002 UTC (9 years, 3 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.44: preferred, colored
Changes since revision 1.44: +91 -42 lines
Change locking so that all snapshots on a particular filesystem share
a common lock. This change avoids a deadlock between snapshots when
separate requests cause them to deadlock checking each other for a
need to copy blocks that are close enough together that they fall
into the same indirect block. Although I had anticipated a slowdown
from contention for the single lock, my filesystem benchmarks show
no measurable change in throughput on a uniprocessor system with
three active snapshots. I conjecture that this result is because
every copy-on-write fault must check all the active snapshots, so
the process was inherently serial already. This change removes the
last of the deadlocks of which I am aware in snapshots.

Sponsored by:	DARPA & NAI Labs.

Revision 1.44: download - view: text, markup, annotated - select for diffs
Wed Oct 9 12:19:36 2002 UTC (9 years, 4 months ago) by mux
Branches: MAIN
Diff to: previous 1.43: preferred, colored
Changes since revision 1.43: +3 -2 lines
Fix build of 64 bit platforms.

Revision 1.43: download - view: text, markup, annotated - select for diffs
Wed Oct 9 06:13:48 2002 UTC (9 years, 4 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.42: preferred, colored
Changes since revision 1.42: +135 -7 lines
When creating a snapshot, create a list of initially allocated blocks.
Whenever doing a copy-on-write check, first look in the list of
initially allocated blocks to see if it is there. If so, no further
check is needed. If not, fall through and do the full check. This
change eliminates one of two known deadlocks caused by snapshots.
Handling the second deadlock will be the subject of another check-in.
This change also reduces the cost of the copy-on-write check by
speeding up the verification of frequently checked blocks.

Sponsored by:	DARPA & NAI Labs.
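
The fast path described above might look roughly like the sketch below. It assumes the list is kept sorted so that it can be searched by bisection; the structure and function names are hypothetical and are not taken from ffs_snapshot.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the snapshot's precomputed block list. */
struct snap_blklist {
    const int64_t *blks;    /* sorted in ascending order */
    size_t         nblks;
};

/* Binary search: returns 1 if lbn is in the sorted list, else 0. */
static int
blklist_contains(const struct snap_blklist *l, int64_t lbn)
{
    size_t lo = 0, hi = l->nblks;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (l->blks[mid] == lbn)
            return (1);
        if (l->blks[mid] < lbn)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (0);
}

/*
 * Returns 1 when the expensive full check is still required, 0 when
 * the fast path has already shown that no further check is needed.
 */
static int
needs_full_cow_check(const struct snap_blklist *l, int64_t lbn)
{
    return (!blklist_contains(l, lbn));
}
```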

Revision 1.42: download - view: text, markup, annotated - select for diffs
Tue Oct 8 21:00:52 2002 UTC (9 years, 4 months ago) by jeff
Branches: MAIN
Diff to: previous 1.41: preferred, colored
Changes since revision 1.41: +1 -1 lines
 - Remove LK_INTERLOCK from the vn_lock() in ffs_snapshot().

Pointy hat to:	me
Found by:	green

Revision 1.41: download - view: text, markup, annotated - select for diffs
Wed Sep 25 02:47:49 2002 UTC (9 years, 4 months ago) by jeff
Branches: MAIN
Diff to: previous 1.40: preferred, colored
Changes since revision 1.40: +2 -3 lines
 - Document broken locking.
 - Use vrefcnt().

Revision 1.40: download - view: text, markup, annotated - select for diffs
Fri Sep 20 16:42:33 2002 UTC (9 years, 4 months ago) by phk
Branches: MAIN
Diff to: previous 1.39: preferred, colored
Changes since revision 1.39: +0 -2 lines
We don't need to #include <sys/disklabel.h>.
We don't need to #include <sys/disklabel.h> a second time either.

Sponsored by:	DARPA & NAI Labs.

Revision 1.39: download - view: text, markup, annotated - select for diffs
Sun Aug 4 10:29:35 2002 UTC (9 years, 6 months ago) by jeff
Branches: MAIN
Diff to: previous 1.38: preferred, colored
Changes since revision 1.38: +12 -6 lines
 - Replace v_flag with v_iflag and v_vflag
 - v_vflag is protected by the vnode lock and is used when synchronization
   with VOP calls is needed.
 - v_iflag is protected by interlock and is used for dealing with vnode
   management issues.  These flags include X/O LOCK, FREE, DOOMED, etc.
 - All accesses to v_iflag and v_vflag have either been locked or marked with
   mp_fixme's.
 - Many ASSERT_VOP_LOCKED calls have been added where the locking was not
   clear.
 - Many functions in vfs_subr.c were restructured to provide for stronger
   locking.

Idea stolen from:	BSD/OS

Revision 1.38: download - view: text, markup, annotated - select for diffs
Sun Jun 23 18:17:26 2002 UTC (9 years, 7 months ago) by mux
Branches: MAIN
Diff to: previous 1.37: preferred, colored
Changes since revision 1.37: +4 -4 lines
Warning fixes for 64-bit platforms.  This eliminates all the
warnings I have had in the FFS code on sparc64.

Reviewed by:	mckusick

Revision 1.37: download - view: text, markup, annotated - select for diffs
Sun Jun 23 06:12:21 2002 UTC (9 years, 7 months ago) by dillon
Branches: MAIN
Diff to: previous 1.36: preferred, colored
Changes since revision 1.36: +12 -12 lines
Rename the BALLOC flags from B_* to BA_* to avoid confusion with the
struct buf B_ flags.

Approved by:	mckusick

Revision 1.36: download - view: text, markup, annotated - select for diffs
Fri Jun 21 06:18:03 2002 UTC (9 years, 7 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.35: preferred, colored
Changes since revision 1.35: +474 -137 lines
This commit adds basic support for the UFS2 filesystem. The UFS2
filesystem expands the inode to 256 bytes to make space for 64-bit
block pointers. It also adds a file-creation time field, an ability
to use jumbo blocks per inode to allow extent like pointer density,
and space for extended attributes (up to twice the filesystem block
size worth of attributes, e.g., on a 16K filesystem, there is space
for 32K of attributes). UFS2 fully supports and runs existing UFS1
filesystems. New filesystems built using newfs can be built in either
UFS1 or UFS2 format using the -O option. In this commit UFS1 is
the default format, so if you want to build UFS2 format filesystems,
you must specify -O 2. This default will be changed to UFS2 when
UFS2 proves itself to be stable. In this commit the boot code for
reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c)
as there is insufficient space in the boot block. Once the size of the
boot block is increased, this code can be defined.

Things to note: the definition of SBSIZE has changed to SBLOCKSIZE.
The header file <ufs/ufs/dinode.h> must be included before
<ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and
ufs_lbn_t.

Still TODO:
Verify that the first level bootstraps work for all the architectures.
Convert the utility ffsinfo to understand UFS2 and test growfs.
Add support for the extended attribute storage. Update soft updates
to ensure integrity of extended attribute storage. Switch the
current extended attribute interfaces to use the extended attribute
storage. Add the extent like functionality (framework is there,
but is currently never used).

Sponsored by: DARPA & NAI Labs.
Reviewed by:	Poul-Henning Kamp <phk@freebsd.org>

Revision 1.35: download - view: text, markup, annotated - select for diffs
Sun May 12 20:21:40 2002 UTC (9 years, 9 months ago) by phk
Branches: MAIN
Diff to: previous 1.34: preferred, colored
Changes since revision 1.34: +1 -0 lines
ARGH!  SBLOCK is not unused.  Try to get this right.

BBSIZE belongs in <sys/disklabel.h> (but shouldn't be a constant).

Define SBLOCK again, using the right math.

Sponsored by: DARPA & NAI Labs.

Revision 1.65: download - view: text, markup, annotated - select for diffs
Mon Apr 22 06:53:20 2002 UTC (9 years, 9 months ago) by phk
Branches: MAIN
Diff to: previous 1.64: preferred, colored
Changes since revision 1.64: +5 -0 lines
Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a nested include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
Comment out Kirk's io-request priority hack until we can do this in a
civilized way which doesn't cause grief.

The problem is that it is not generally safe to cast a "struct bio
*" to a "struct buf *".  Things like ccd, vinum, ata-raid and GEOM
construct bio's which are not entrails of a struct buf.

Also, curthread may or may not have anything to do with the I/O request
at hand.

The correct solution can either be to tag struct bio's with a
priority derived from the requesting thread's nice value and have disksort
act on this field; this wouldn't address the "silly-seek syndrome"
where two equal processes bang the diskheads from one edge to the
other of the disk repeatedly.

Alternatively, and probably better: a sleep should be introduced
either at the time the I/O is requested or at the time it is completed
where we can be sure to sleep in the right thread.

The sleep also needs to be in constant time units; 1/hz can be practically
any sub-second size, and at high HZ the current code practically doesn't
do anything.
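
To illustrate the point about time units: a sleep of one tick lasts 1/hz seconds, so its length depends on the configured clock rate. The helpers below are hypothetical and only show the arithmetic for converting a fixed delay into ticks.

```c
/*
 * Convert a delay in milliseconds to clock ticks, rounding up so that
 * a nonzero delay never becomes zero ticks.
 */
static int
ms_to_ticks(int ms, int hz)
{
    return ((ms * hz + 999) / 1000);
}

/* Duration of a one-tick sleep, in microseconds. */
static int
tick_usec(int hz)
{
    return (1000000 / hz);
}
```

With hz = 100 a single tick is 10000 microseconds, while with hz = 1000 it is only 1000 microseconds, so a delay expressed as "one tick" shrinks tenfold.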

Revision 1.34: download - view: text, markup, annotated - select for diffs
Tue Mar 19 22:40:46 2002 UTC (9 years, 10 months ago) by alfred
Branches: MAIN
Diff to: previous 1.33: preferred, colored
Changes since revision 1.33: +19 -19 lines
Remove __P.

Revision 1.33: download - view: text, markup, annotated - select for diffs
Tue Mar 19 04:09:21 2002 UTC (9 years, 10 months ago) by bde
Branches: MAIN
Diff to: previous 1.32: preferred, colored
Changes since revision 1.32: +6 -5 lines
Fixed some printf format errors (hopefully all of the remaining daddr64_t
ones for GENERIC, and all others on the same line as those).  Reformat
the printfs if necessary to avoid new long lines or old format printf
errors.

Revision 1.32: download - view: text, markup, annotated - select for diffs
Sun Mar 17 01:25:46 2002 UTC (9 years, 10 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.31: preferred, colored
Changes since revision 1.31: +2 -1 lines
Add a flags parameter to VFS_VGET to pass through the desired
locking flags when acquiring a vnode. The immediate purpose is
to allow polling lock requests (LK_NOWAIT) needed by soft updates
to avoid deadlock when enlisting other processes to help with
the background cleanup. For the future it will allow the use of
shared locks for read access to vnodes. This change touches a
lot of files as it affects most filesystems within the system.
It has been well tested on FFS, loopback, and CD-ROM filesystems, but
only lightly on the others, so if you find a problem there, please
let me (mckusick@mckusick.com) know.

Revision 1.31: download - view: text, markup, annotated - select for diffs
Fri Mar 15 18:49:46 2002 UTC (9 years, 10 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.30: preferred, colored
Changes since revision 1.30: +4 -4 lines
Introduce the new 64-bit size disk block, daddr64_t. Change
the bio and buffer structures to have daddr64_t bio_pblkno,
b_blkno, and b_lblkno fields which allows access to disks
larger than a Terabyte in size. This change also requires
that the VOP_BMAP vnode operation accept and return daddr64_t
blocks. This delta should not affect system operation in
any way. It merely sets up the necessary interfaces to allow
the development of disk drivers that work with these larger
disk block addresses. It also allows for the development of
UFS2 which will use 64-bit block addresses.
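
The terabyte figure can be checked with a little arithmetic. The sketch assumes that the earlier block number was a signed 32-bit count of 512-byte sectors; both the sector size and the helper names are assumptions made for illustration.

```c
#include <stdint.h>

#define SECTOR_SIZE 512     /* assumed sector size in bytes */

/* Number of bytes addressable with a signed 32-bit sector number. */
static int64_t
max_bytes_32(void)
{
    return (((int64_t)INT32_MAX + 1) * SECTOR_SIZE);
}

/* Byte offset of a sector, computed in 64 bits to avoid overflow. */
static int64_t
sector_to_bytes(int64_t blkno)
{
    return (blkno * SECTOR_SIZE);
}
```

Under those assumptions 2^31 sectors of 512 bytes come to 2^40 bytes, that is, one terabyte (binary), which is the limit that a 64-bit block number removes.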

Revision 1.30: download - view: text, markup, annotated - select for diffs
Wed Feb 27 19:18:10 2002 UTC (9 years, 11 months ago) by jhb
Branches: MAIN
Diff to: previous 1.29: preferred, colored
Changes since revision 1.29: +1 -1 lines
Use thread0.td_ucred instead of proc0.p_ucred.  This change is cosmetic
and isn't strictly required.  However, it lowers the number of false
positives found when grep'ing the kernel sources for p_ucred to ensure
proper locking.

Revision 1.29: download - view: text, markup, annotated - select for diffs
Wed Feb 27 18:32:21 2002 UTC (9 years, 11 months ago) by jhb
Branches: MAIN
Diff to: previous 1.28: preferred, colored
Changes since revision 1.28: +1 -1 lines
Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.

Revision 1.28: download - view: text, markup, annotated - select for diffs
Mon Feb 11 20:37:54 2002 UTC (10 years ago) by julian
Branches: MAIN
Diff to: previous 1.27: preferred, colored
Changes since revision 1.27: +2 -2 lines
In a threaded world, different priorities become properties of
different entities.  Make it so.

Reviewed by:	jhb@freebsd.org (john baldwin)

Revision 1.27: download - view: text, markup, annotated - select for diffs
Sat Feb 2 01:42:44 2002 UTC (10 years ago) by mckusick
Branches: MAIN
Diff to: previous 1.26: preferred, colored
Changes since revision 1.26: +215 -161 lines
When taking a snapshot, we must check for active files that have
been unlinked (e.g., with a zero link count). We have to expunge
all trace of these files from the snapshot so that they are neither
reclaimed prematurely by fsck nor saved unnecessarily by dump.

Revision 1.26: download - view: text, markup, annotated - select for diffs
Thu Jan 17 08:33:32 2002 UTC (10 years ago) by mckusick
Branches: MAIN
Diff to: previous 1.25: preferred, colored
Changes since revision 1.25: +2 -2 lines
Fix a bug introduced in ffs_snapshot.c -r1.25 and fs.h -r1.26
which caused incomplete snapshots to be taken. When background
fsck would run on these snapshots, the result would be files
being incorrectly released which would subsequently panic the
kernel with ``handle_workitem_freefile: inodedep survived'',
``handle_written_inodeblock: live inodedep'', and
``handle_workitem_remove: lost inodedep'' errors.

Revision 1.25: download - view: text, markup, annotated - select for diffs
Tue Dec 18 18:05:17 2001 UTC (10 years, 1 month ago) by mckusick
Branches: MAIN
Diff to: previous 1.24: preferred, colored
Changes since revision 1.24: +4 -4 lines
Change the atomic_set_char to atomic_set_int and atomic_clear_char
to atomic_clear_int to ease the implementation for the sparc64.

Requested by:	Jake Burkholder <jake@locore.ca>

Revision 1.24: download - view: text, markup, annotated - select for diffs
Fri Dec 14 00:15:06 2001 UTC (10 years, 2 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.23: preferred, colored
Changes since revision 1.23: +193 -96 lines
Minimize the time necessary to suspend operations on a filesystem
when taking a snapshot. The two time consuming operations are
scanning all the filesystem bitmaps to determine which blocks
are in use and scanning all the other snapshots so as to be able
to expunge their blocks from the view of the current snapshot.
The bitmap scanning is broken into two passes. Before suspending
the filesystem all bitmaps are scanned. After the suspension,
those bitmaps that changed after being scanned the first time
are rescanned. Typically there are few bitmaps that need to be
rescanned. The expunging of other snapshots is now done after
the suspension is released by observing that we can easily
identify any blocks that were allocated to them after the
suspension (they will be marked as `not needing to be copied'
in the just created snapshot). For all the gory details, see
the ``Running fsck in the Background'' paper in the Usenix
BSDCon 2002 Conference Proceedings, pages 55-64.
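
The two-pass bitmap scan can be sketched as follows. A per-group generation counter is used here to detect changes between the passes; that mechanism, and every name in the block, is an assumption made for illustration rather than the way the kernel tracks modified bitmaps.

```c
#include <stddef.h>

/* Hypothetical per-cylinder-group state for the sketch. */
struct cg_sketch {
    unsigned gen;           /* bumped on every bitmap change */
    unsigned scanned_gen;   /* generation seen by the last scan */
    int      scanned;       /* nonzero once scanned at least once */
};

static void
scan_cg(struct cg_sketch *cg)
{
    cg->scanned_gen = cg->gen;
    cg->scanned = 1;
}

/* Pass 1: the filesystem is still active, so scan every group. */
static void
scan_before_suspend(struct cg_sketch *cgs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        scan_cg(&cgs[i]);
}

/*
 * Pass 2: the filesystem is suspended, so rescan only the groups that
 * changed since pass 1. Returns the number of groups rescanned.
 */
static size_t
rescan_after_suspend(struct cg_sketch *cgs, size_t n)
{
    size_t rescanned = 0;

    for (size_t i = 0; i < n; i++) {
        if (!cgs[i].scanned || cgs[i].scanned_gen != cgs[i].gen) {
            scan_cg(&cgs[i]);
            rescanned++;
        }
    }
    return (rescanned);
}
```

The time spent suspended is then proportional to the number of groups that changed during pass 1, not to the size of the filesystem.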

Revision 1.23: download - view: text, markup, annotated - select for diffs
Wed Sep 12 08:38:07 2001 UTC (10 years, 5 months ago) by julian
Branches: MAIN
CVS tags: KSE_MILESTONE_2
Diff to: previous 1.22: preferred, colored
Changes since revision 1.22: +54 -54 lines
KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha

Revision 1.22: download - view: text, markup, annotated - select for diffs
Mon May 14 17:16:49 2001 UTC (10 years, 9 months ago) by mckusick
Branches: MAIN
CVS tags: KSE_PRE_MILESTONE_2
Diff to: previous 1.21: preferred, colored
Changes since revision 1.21: +21 -8 lines
Further fixes for deadlock in the presence of multiple snapshots.
There are still more to find, but this fix should cover the
common cases that folks are hitting.

Revision 1.21: download - view: text, markup, annotated - select for diffs
Fri May 11 07:12:03 2001 UTC (10 years, 9 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.20: preferred, colored
Changes since revision 1.20: +7 -4 lines
Remove yet another deadlock case.

Revision 1.20: download - view: text, markup, annotated - select for diffs
Tue May 8 07:29:03 2001 UTC (10 years, 9 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.19: preferred, colored
Changes since revision 1.19: +20 -11 lines
Several fixes for units errors:
1) Do not assume that the superblock will be of size fs->fs_bsize.
   This fixes a panic when taking a snapshot on a filesystem with
   a block size bigger than 8K.
2) Properly calculate the number of fragments that follow the
   superblock summary information. This fixes a bug with inconsistent
   snapshots.
3) When cleaning up a snapshot that is about to be removed, properly
   calculate the number of blocks that need to be checked. This fixes
   a bug that created partially allocated inodes.
4) When moving blocks from a snapshot that is about to be removed
   to another snapshot, properly account for the reduced number of
   blocks in the snapshot from which they are taken. This fixes a
   bug in which the number of blocks released from a snapshot did not
   match the number that it claimed to have.

Revision 1.19: download - view: text, markup, annotated - select for diffs
Fri May 4 05:49:28 2001 UTC (10 years, 9 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.18: preferred, colored
Changes since revision 1.18: +226 -118 lines
Refinement to revision 1.16 of ufs/ffs/ffs_snapshot.c to reduce
the amount of time that the filesystem must be suspended. The
current snapshot is elided as well as the earlier snapshots.

Revision 1.18: download - view: text, markup, annotated - select for diffs
Sun Apr 29 12:36:48 2001 UTC (10 years, 9 months ago) by phk
Branches: MAIN
Diff to: previous 1.17: preferred, colored
Changes since revision 1.17: +22 -22 lines
VOP_BALLOC was never really a VOP in the first place, so convert it
to UFS_BALLOC like the other "between UFS and FFS function interfaces".

Revision 1.17: download - view: text, markup, annotated - select for diffs
Sun Apr 29 02:45:12 2001 UTC (10 years, 9 months ago) by grog
Branches: MAIN
Diff to: previous 1.16: preferred, colored
Changes since revision 1.16: +1 -3 lines
Revert consequences of changes to mount.h, part 2.

Requested by:	bde

Revision 1.16: download - view: text, markup, annotated - select for diffs
Thu Apr 26 00:50:53 2001 UTC (10 years, 9 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.15: preferred, colored
Changes since revision 1.15: +20 -36 lines
Rather than copying all the indirect blocks of the snapshot,
simply mark them as BLK_NOCOPY. This trick cuts the initial
size of the snapshot in half and cuts the time to take a
snapshot by a third.

Revision 1.15: download - view: text, markup, annotated - select for diffs
Wed Apr 25 08:11:17 2001 UTC (10 years, 9 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.14: preferred, colored
Changes since revision 1.14: +67 -18 lines
When closing the last reference to an unlinked file, it is freed
by the inactive routine. Because the freeing causes the filesystem
to be modified, the close must be held up during periods when the
filesystem is suspended.

For snapshots to be consistent across crashes, they must write
blocks that they copy and claim those written blocks in their
on-disk block pointers before the old blocks that they referenced
can be allowed to be written.

Close a loophole that allowed unwritten blocks to be skipped when
doing ffs_sync with a request to wait for all I/O activity to be
completed.

Revision 1.14: download - view: text, markup, annotated - select for diffs
Mon Apr 23 08:58:56 2001 UTC (10 years, 9 months ago) by grog
Branches: MAIN
Diff to: previous 1.13: preferred, colored
Changes since revision 1.13: +3 -1 lines
Correct #includes to work with fixed sys/mount.h.

Revision 1.13: download - view: text, markup, annotated - select for diffs
Sat Apr 14 05:26:28 2001 UTC (10 years, 10 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.12: preferred, colored
Changes since revision 1.12: +2 -2 lines
This checkin adds support in ufs/ffs for the FS_NEEDSFSCK flag.
It is described in ufs/ffs/fs.h as follows:

/*
 * Filesystem flags.
 *
 * Note that the FS_NEEDSFSCK flag is set and cleared only by the
 * fsck utility. It is set when background fsck finds an unexpected
 * inconsistency which requires a traditional foreground fsck to be
 * run. Such inconsistencies should only be found after an uncorrectable
 * disk error. A foreground fsck will clear the FS_NEEDSFSCK flag when
 * it has successfully cleaned up the filesystem. The kernel uses this
 * flag to enforce that inconsistent filesystems be mounted read-only.
 */
#define FS_UNCLEAN    0x01	/* filesystem not clean at mount */
#define FS_DOSOFTDEP  0x02	/* filesystem using soft dependencies */
#define FS_NEEDSFSCK  0x04	/* filesystem needs sync fsck before mount */
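
As an illustration of how such a flag could be enforced, the sketch below reuses the three flag values quoted above. The helper may_mount_rw() is hypothetical; the actual mount code is not shown in this log and checks more than this one flag.

```c
#include <stdint.h>

#define FS_UNCLEAN    0x01  /* filesystem not clean at mount */
#define FS_DOSOFTDEP  0x02  /* filesystem using soft dependencies */
#define FS_NEEDSFSCK  0x04  /* filesystem needs sync fsck before mount */

/* Hypothetical policy helper: 1 if a read-write mount may proceed. */
static int
may_mount_rw(uint32_t fs_flags)
{
    return ((fs_flags & FS_NEEDSFSCK) == 0);
}
```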

Revision 1.12: download - view: text, markup, annotated - select for diffs
Wed Mar 21 04:05:20 2001 UTC (10 years, 10 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.11: preferred, colored
Changes since revision 1.11: +21 -17 lines
Clear the fs_clean flag only when the FS_UNCLEAN flag is not set
(as is done in unmount).

Remove a snapshot inode from the superblock list when its last
name goes away rather than when its last reference goes away.
That way it will be properly reclaimed by fsck after a crash
rather than reenabled when the filesystem is mounted.

Revision 1.11: download - view: text, markup, annotated - select for diffs
Wed Mar 7 07:09:54 2001 UTC (10 years, 11 months ago) by mckusick
Branches: MAIN
Diff to: previous 1.10: preferred, colored
Changes since revision 1.10: +49 -50 lines
Fixes to track snapshot copy-on-write checking in the specinfo
structure rather than assuming that the device vnode would reside
in the FFS filesystem (which is obviously a broken assumption with
the device filesystem).

Revision 1.10: download - view: text, markup, annotated - select for diffs
Mon Feb 12 00:20:07 2001 UTC (11 years ago) by jake
Branches: MAIN
Diff to: previous 1.9: preferred, colored
Changes since revision 1.9: +3 -3 lines
Implement a unified run queue and adjust priority levels accordingly.

- All processes go into the same array of queues, with different
  scheduling classes using different portions of the array.  This
  allows user processes to have their priorities propagated up into
  interrupt thread range if need be.
- I chose 64 run queues as an arbitrary number that is greater than
  32.  We used to have 4 separate arrays of 32 queues each, so this
  may not be optimal.  The new run queue code was written with this
  in mind; changing the number of run queues only requires changing
  constants in runq.h and adjusting the priority levels.
- The new run queue code takes the run queue as a parameter.  This
  is intended to be used to create per-cpu run queues.  Implement
  wrappers for compatibility with the old interface which pass in
  the global run queue structure.
- Group the priority level, user priority, native priority (before
  propagation) and the scheduling class into a struct priority.
- Change any hard coded priority levels that I found to use
  symbolic constants (TTIPRI and TTOPRI).
- Remove the curpriority global variable and use that of curproc.
  This was used to detect when a process' priority had lowered and
  it should yield.  We now effectively yield on every interrupt.
- Activate propogate_priority().  It should now have the desired
  effect without needing to also propogate the scheduling class.
- Temporarily comment out the call to vm_page_zero_idle() in the
  idle loop.  It interfered with propogate_priority() because
  the idle process needed to do a non-blocking acquire of Giant
  and then other processes would try to propagate their priority
  onto it.  The idle process should not do anything except idle.
  vm_page_zero_idle() will return in the form of an idle priority
  kernel thread which is woken up at appropriate times by the vm
  system.
- Update struct kinfo_proc to the new priority interface.  Deliberately
  change its size by adjusting the spare fields.  It remained the same
  size, but the layout has changed, so userland processes that use it
  would parse the data incorrectly.  The size constraint should really
  be changed to an arbitrary version number.  Also add a debug.sizeof
  sysctl node for struct kinfo_proc.
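
A rough sketch of a single array of run queues that is passed in as a parameter, as the list above describes. The count of 64 queues is from the commit message; the 256 priority levels, the bitmap of non-empty queues, the convention that lower numbers are better, and all names are assumptions made for illustration.

```c
#include <stdint.h>

#define NQUEUES   64    /* from the commit message */
#define NPRIO     256   /* assumed number of priority levels */
#define PRI_PER_Q (NPRIO / NQUEUES)

struct runq_sketch {
    uint64_t nonempty;          /* bit i set when queue i has work */
    int      count[NQUEUES];    /* entries queued per queue */
};

/* Queue an entry of priority pri (0 .. NPRIO - 1) on the given runq. */
static void
runq_add(struct runq_sketch *rq, int pri)
{
    int q = pri / PRI_PER_Q;

    rq->count[q]++;
    rq->nonempty |= (uint64_t)1 << q;
}

/* Index of the best (lowest-numbered) non-empty queue, or -1. */
static int
runq_best(const struct runq_sketch *rq)
{
    for (int q = 0; q < NQUEUES; q++)
        if (rq->nonempty & ((uint64_t)1 << q))
            return (q);
    return (-1);
}
```

Because the queue structure is an argument and not a global, the same functions could serve one queue per CPU, which is the use the message anticipates.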

Revision 1.9: download - view: text, markup, annotated - select for diffs
Mon Jan 15 18:30:40 2001 UTC (11 years ago) by iedowse
Branches: MAIN
Diff to: previous 1.8: preferred, colored
Changes since revision 1.8: +5 -2 lines
The ffs superblock includes a 128-byte region for use by temporary
in-core pointers to summary information. An array in this region
(fs_csp) could overflow on filesystems with a very large number of
cylinder groups (~16000 on i386 with 8k blocks). When this happens,
other fields in the superblock get corrupted, and fsck refuses to
check the filesystem.

Solve this problem by replacing the fs_csp array in 'struct fs'
with a single pointer, and add padding to keep the length of the
128-byte region fixed. Update the kernel and userland utilities
to use just this single pointer.

With this change, the kernel no longer makes use of the superblock
fields 'fs_csshift' and 'fs_csmask'. Add a comment to newfs/mkfs.c
to indicate that these fields must be calculated for compatibility
with older kernels.
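
As a rough illustration of the layout change described above, the sketch
below keeps a 128-byte region at a fixed size while holding one pointer
plus padding instead of a pointer array. The names csum_sketch,
region_new and cg_summary are hypothetical and are not taken from the
FreeBSD headers; only the 128-byte figure comes from the log message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-cylinder-group summary record. */
struct csum_sketch {
    int cs_ndir;
    int cs_nbfree;
    int cs_nifree;
    int cs_nffree;
};

/* Size of the in-core pointer region, as stated in the log message. */
#define REGION_BYTES 128

/*
 * One pointer to a separately allocated summary table, followed by
 * padding so that the region keeps its original length and the fields
 * after it do not move.
 */
struct region_new {
    struct csum_sketch *csp;
    unsigned char pad[REGION_BYTES - sizeof(struct csum_sketch *)];
};

/* Compile-time check that the region length is unchanged. */
static_assert(sizeof(struct region_new) == REGION_BYTES,
    "region must stay 128 bytes");

/*
 * Look up the summary for cylinder group cg through the single pointer.
 * The table can be as long as needed, so a large cylinder group count
 * no longer spills past the end of the region.
 */
static struct csum_sketch *
cg_summary(struct region_new *r, size_t cg)
{
    return &r->csp[cg];
}
```

Because the table lives outside the region, its length is bounded by the
allocation rather than by how many pointers fit in 128 bytes.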

Reviewed by:	mckusick

Revision 1.8
Fri Jan 12 21:56:55 2001 UTC (11 years, 1 month ago) by mckusick
Branches: MAIN
Changes since revision 1.7: +2 -2 lines
Properly compute the size of the final block of superblock summary information.

Submitted by:	Ian Dowse <iedowse@maths.tcd.ie>

Revision 1.7
Tue Dec 19 04:41:02 2000 UTC (11 years, 1 month ago) by mckusick
Branches: MAIN
Changes since revision 1.6: +37 -15 lines
Several small but important fixes for snapshots:

1) Be more tolerant of missing snapshot files by only trying to decrement
   their reference count if they are registered as active.

2) Fix for snapshots of filesystems with block sizes larger than 8K
   (from Ollivier Robert <roberto@eurocontrol.fr>).

3) Fix to avoid losing last block in snapshot file when calculating blocks
   that need to be copied (from Don Coleman <coleman@coleman.org>).

Revision 1.6
Sun Sep 17 19:41:26 2000 UTC (11 years, 4 months ago) by des
Branches: MAIN
Changes since revision 1.5: +2 -2 lines
Silence a warning.

Revision 1.5
Thu Sep 7 01:33:01 2000 UTC (11 years, 5 months ago) by jasone
Branches: MAIN
Changes since revision 1.4: +1 -2 lines
Major update to the way synchronization is done in the kernel.  Highlights
include:

* Mutual exclusion is used instead of spl*().  See mutex(9).  (Note: The
  alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
  preempted (i386 only).

Partially contributed by:	BSDi (BSD/OS)
Submissions by (at least):	cp, dfr, dillon, grog, jake, jhb, sheldonh

Revision 1.4
Wed Jul 26 23:06:50 2000 UTC (11 years, 6 months ago) by mckusick
Branches: MAIN
CVS tags: PRE_SMPNG
Changes since revision 1.3: +7 -9 lines
Clean up the snapshot code so that it no longer depends on the use of
the SF_IMMUTABLE flag to prevent writing. Instead put in explicit
checking for the SF_SNAPSHOT flag in the appropriate places. With
this change, it is now possible to rename and link to snapshot files.
It is also possible to set or clear any of the owner, group, or
other read bits on the file, though none of the write or execute
bits can be set. There is also an explicit test to prevent the
setting or clearing of the SF_SNAPSHOT flag via chflags() or
fchflags(). Note also that the modify time cannot be changed as
it needs to accurately reflect the time that the snapshot was taken.
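
The permission rules above can be sketched as two small checks. This is
only an illustration under stated assumptions: the helper names and the
SKETCH_SF_SNAPSHOT constant are hypothetical, and the real kernel code
may be organized quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-in for the snapshot file flag (assumed value). */
#define SKETCH_SF_SNAPSHOT 0x00200000UL

static_assert((SKETCH_SF_SNAPSHOT & (SKETCH_SF_SNAPSHOT - 1)) == 0,
    "flag must be a single bit");

/*
 * A new mode for a snapshot file is acceptable only if it sets no
 * write or execute bits; any combination of read bits is allowed.
 */
static bool
snapshot_mode_ok(mode_t new_mode)
{
    const mode_t forbidden = S_IWUSR | S_IWGRP | S_IWOTH |
        S_IXUSR | S_IXGRP | S_IXOTH;

    return (new_mode & forbidden) == 0;
}

/*
 * A flags change is acceptable only if it leaves the snapshot bit
 * exactly as it was, whether that bit is currently set or clear.
 */
static bool
snapshot_flags_ok(unsigned long old_flags, unsigned long new_flags)
{
    return ((old_flags ^ new_flags) & SKETCH_SF_SNAPSHOT) == 0;
}
```

Comparing old and new flags with XOR rejects both setting and clearing
the bit with a single test.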

Submitted by:	Robert Watson <rwatson@FreeBSD.org>

Revision 1.3
Mon Jul 24 05:28:31 2000 UTC (11 years, 6 months ago) by mckusick
Branches: MAIN
Changes since revision 1.2: +47 -22 lines
This patch corrects the first round of panics and hangs reported
with the new snapshot code.

Update addaliasu to correctly implement the semantics of the old
checkalias function. When a device vnode first comes into existence,
check to see if an anonymous vnode for the same device was created
at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than
creating a new vnode for the device. This corrects a problem which
caused the kernel to panic when taking a snapshot of the root
filesystem.

Change the calling convention of vn_write_suspend_wait() to be the
same as vn_start_write().

Split out softdep_flushworklist() from softdep_flushfiles() so that
it can be used to clear the work queue when suspending filesystem
operations.

Access to buffers becomes recursive so that snapshots can recursively
traverse their indirect blocks using ffs_copyonwrite() when checking
for the need for copy on write when flushing one of their own indirect
blocks. This eliminates a deadlock between the syncer daemon and a
process taking a snapshot.

Ensure that softdep_process_worklist() can never block because of a
snapshot being taken. This eliminates a problem with buffer starvation.

Clean up ffs_sync(), which did not wait synchronously when
MNT_WAIT was specified. The result was an unclean filesystem panic
when doing forcible unmount with heavy filesystem I/O in progress.

Return a zeroed block when reading a block that was not in use at
the time that a snapshot was taken. Normally, these blocks should
never be read. However, the readahead code will occasionally read
them, which can cause unexpected behavior.
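
A minimal sketch of the zero-fill behavior, assuming a snapshot keeps a
per-block map in which a special marker means "not in use when the
snapshot was taken". The marker SKETCH_BLK_UNUSED, the tiny block size,
and snap_read_block() are all hypothetical and chosen for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical block size, kept tiny so the example is easy to follow. */
enum { SKETCH_BLKSIZE = 16 };

/* Hypothetical map entry meaning "unused at snapshot time". */
#define SKETCH_BLK_UNUSED ((int64_t)-1)

/*
 * Read logical block lbn of a snapshot into out. Blocks that were not
 * in use at snapshot time have no meaningful contents, so hand back
 * zeros instead of whatever happens to be on disk. This keeps a stray
 * readahead from exposing unrelated data.
 */
void
snap_read_block(const int64_t *blkmap, const unsigned char *disk,
    size_t lbn, unsigned char *out)
{
    int64_t pbn = blkmap[lbn];

    if (pbn == SKETCH_BLK_UNUSED) {
        memset(out, 0, SKETCH_BLKSIZE);
        return;
    }
    memcpy(out, disk + (size_t)pbn * SKETCH_BLKSIZE, SKETCH_BLKSIZE);
}
```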

Clean up the debugging code that ensures that no blocks be written
on a filesystem while it is suspended. Snapshots must explicitly
label the blocks that they are writing during the suspension so that
they do not cause a `write on suspended filesystem' panic.

Reorganize ffs_copyonwrite() to eliminate a deadlock and also to
prevent a race condition that would permit the same block to be
copied twice. This change eliminates an unexpected soft updates
inconsistency in fsck caused by the double allocation.

Use bqrelse rather than brelse for buffers that will be needed
soon again by the snapshot code. This improves snapshot performance.

Revision 1.2
Wed Jul 12 00:27:27 2000 UTC (11 years, 7 months ago) by mckusick
Branches: MAIN
Changes since revision 1.1: +6 -5 lines
Brain fault, forgot to update ffs_snapshot.c with the new calling convention
for vn_start_write.

Revision 1.1
Tue Jul 11 22:07:54 2000 UTC (11 years, 7 months ago) by mckusick
Branches: MAIN
Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then syncs the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems
because, on SMP systems, it adds two extra locks to such critical
kernel paths as the write system call. Thus, gating should only be
added as needed.
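
The gating idea can be sketched in user space with POSIX threads. This
is not the kernel implementation: struct write_gate and the gate_*
functions are hypothetical names, and the final sync-to-disk step is
left to the caller.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space model of a per-filesystem write gate. */
struct write_gate {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int  writers;    /* modifying operations currently in progress */
    bool suspended;  /* true while new writers must not start */
};

#define WRITE_GATE_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, false }

/* Enter a modifying operation, sleeping while writes are suspended. */
void
gate_start_write(struct write_gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (g->suspended)
        pthread_cond_wait(&g->cv, &g->lock);
    g->writers++;
    pthread_mutex_unlock(&g->lock);
}

/* Non-blocking variant: returns false instead of sleeping. */
bool
gate_try_start_write(struct write_gate *g)
{
    bool ok;

    pthread_mutex_lock(&g->lock);
    ok = !g->suspended;
    if (ok)
        g->writers++;
    pthread_mutex_unlock(&g->lock);
    return ok;
}

/* Leave a modifying operation; wake a suspender once the last one ends. */
void
gate_finish_write(struct write_gate *g)
{
    pthread_mutex_lock(&g->lock);
    if (--g->writers == 0)
        pthread_cond_broadcast(&g->cv);
    pthread_mutex_unlock(&g->lock);
}

/* Stop new writers, then wait for the ones in progress to drain. */
void
gate_suspend(struct write_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->suspended = true;
    while (g->writers > 0)
        pthread_cond_wait(&g->cv, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

/* Let suspended writers proceed again. */
void
gate_resume(struct write_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->suspended = false;
    pthread_cond_broadcast(&g->cv);
    pthread_mutex_unlock(&g->lock);
}
```

In this model every write takes the gate's lock once on entry and once
on exit, which gives a feel for why such gating has a cost on hot paths.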

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot, so for brevity and timeliness
they are not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
