
Define a build protocol for single package build #1

Open
hegner opened this issue Mar 17, 2015 · 48 comments

@hegner
Member

hegner commented Mar 17, 2015

The build of individual packages should follow a common 'protocol'. This issue is to discuss what it could look like.

@ktf
Contributor

ktf commented Mar 17, 2015

I would personally be happy with something which allows me to build externals with:

EXTERNAL_ROOT1=/some/path
EXTERNAL_VERSION=vX.Y.Z
OPTION1=someoption
PREFIX=/unique/build/prefix
...
./build.sh

and install them with:

./install.sh

if it needs to become:

cmake -DEXTERNAL_ROOT1=/some/path \
      -DEXTERNAL_VERSION=vX.Y.Z \
      -DOPTION1=someoption \
      -DPREFIX=/unique/build/prefix \
      ....
make
make install

for the sake of collaboration, I'd be happy as well. The real questions I can think of on my side are:

  • Why do you think it's worth adding a (binary) dependency on CMake for this? What value does using CMake add compared to a shell script? How do we build / deploy the initial CMake?

  • Which of the externals in cms-sw/cmsdist do you think I can build that way right now (at the moment not even root or geant4 are so easy for us; can you comment on whether we are doing anything wrong)?

  • Apart from:

    • EXTERNALX_ROOT: the installation path of one of the requirements.
    • EXTERNALX_VERSION: the version of one of the requirements.
    • PREFIX: where built software is installed for packaging.
    • JOBS: the number of parallel jobs for make.

    which I think we all agree on, what other options do we want to be part of the interface?
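As a purely illustrative sketch (not part of anyone's proposal), a package's build.sh honouring just these four variables might look as follows; the configure flags and the optional "pcre" dependency are hypothetical, and the script only echoes the commands it would run:

```shell
#!/bin/sh
# Sketch of a build.sh driven entirely by environment variables.
# Demo defaults are provided so the sketch runs standalone; a real
# caller would export PREFIX, JOBS, PCRE_ROOT, etc. before invoking it.
PREFIX=${PREFIX:-/unique/build/prefix}
JOBS=${JOBS:-4}
PCRE_ROOT=${PCRE_ROOT:-}    # optional dependency root (hypothetical)

# For illustration, echo the commands instead of executing them.
cmd() { echo "+ $*"; }

cmd ./configure --prefix="$PREFIX" \
    ${PCRE_ROOT:+--with-pcre-prefix=$PCRE_ROOT}
cmd make -j"$JOBS"
cmd make install
```

Dropping the `cmd` indirection would turn this into a real build script; the point is only that the entire interface is carried by environment variables with sensible behaviour when optional ones are unset.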

@hegner
Member Author

hegner commented Mar 18, 2015

I like that idea. There are however a few more things we need to consider - platform, compiler, switches like verbosity, build type, a pointer to the source, ...
And one of the more general questions - do we work with options or with environment variables?

As for CMake: ExternalProject_Add has a nice 'protocol' one can take inspiration from. Whether the scripts use CMake internally or not is, at first order, an implementation detail to me. First let's get the interface agreed upon.

@sbinet

sbinet commented Mar 18, 2015

Do we need a CMMI interface (configure-make-make-install)?
Do we need just that?

with CVMFS and other (non-root) overlay-fs technologies, I think we could rather define a canonical installation path (/opt/sw/) which would tremendously simplify build+installation+use procedures.

As a corollary, a proper build protocol, taking into account the full set of options for build reproducibility, should probably be along the lines of Guix/Nix (which makes a hash from that set; see: http://en.wikipedia.org/wiki/Nix_package_manager)
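The hash-from-inputs idea could be sketched like this (the package, version, and variable names are all made up for illustration, and GNU sha256sum is assumed to be available):

```shell
#!/bin/sh
# Illustrative only: derive a Nix/Guix-style unique hash from the
# full, sorted set of build inputs, so identical inputs always map
# to the same canonical installation path under /opt/sw/.
PACKAGE=swig
VERSION=3.0.5
COMPILER=gcc-4.9
PCRE_VERSION=8.36

# Sort so that the hash does not depend on declaration order.
inputs=$(printf '%s\n' \
    "PACKAGE=$PACKAGE" \
    "VERSION=$VERSION" \
    "COMPILER=$COMPILER" \
    "PCRE_VERSION=$PCRE_VERSION" | sort)

hash=$(printf '%s' "$inputs" | sha256sum | cut -c1-12)
echo "/opt/sw/$PACKAGE/$VERSION-$hash"
```

Any change to any input (compiler, dependency version, option) yields a different path, which is exactly what makes parallel installations and reproducibility checks cheap.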

@peremato
Member

Typically things are not as simple as CMMI. You need, as Benedikt pointed out, a number of parameters to decide how to do the CMMI and, more importantly, the dependencies on other packages. I agree with Sebastien that a standard installation location helps to handle the dependencies, but not the build instructions themselves. You need to locate the packages that you depend on, considering the full set of dependencies.
I agree that taking into account all dependencies (and their versions) and possible options is a must, and a hash should be generated similarly to what we currently do with LCGCmake, which is very similar to what the Nix system does.
For example something as simple as building swig requires some work:

LCGPackage_Add(
  swig
  URL http://service-spi.web.cern.ch/service-spi/external/tarFiles/swig-<VERSION>.tar.gz
  IF <VERSION> VERSION_LESS 2.0.0 THEN
    CONFIGURE_COMMAND <SOURCE_DIR>/configure --prefix=<INSTALL_DIR>
  ELSE
    CONFIGURE_COMMAND <SOURCE_DIR>/configure --prefix=<INSTALL_DIR> --with-pcre-prefix=${pcre_home} PCRE_LIBS=${pcre_home}/lib/libpcre.a
    BUILD_COMMAND ${MAKE} CPPFLAGS=-I${pcre_home}/include
    INSTALL_COMMAND ${MAKE} install
    DEPENDS pcre
  ENDIF
 )

For some recent versions it depends on pcre, so some convention on where to find the installation of the adequate version of pcre will be required.

Commenting on Giulio's proposal, I think what we need is a 'declarative' protocol instead of a scripting or execution protocol. We need to know all the details about where the sources are and what the commands are to build, to install, to patch, etc., and it is the system that will execute them in the adequate order, injecting additional commands at adequate points (creation of RPMs, applying patches, making the installation relocatable, creating and collecting log files, etc.). So, a solution based on a set of build.sh scripts is not really sufficient. This is why I think we need to define the 'data' for each package. In a way we could write our CMake LCGPackage_Add() as:

NAME = swig
SOURCE_URL =  http://service-spi.web.cern.ch/service-spi/external/tarFiles/swig-<VERSION>.tar.gz
CONFIGURE_COMMAND =  IF <VERSION> VERSION_LESS 2.0.0 THEN <SOURCE_DIR>/configure --prefix=<INSTALL_DIR> ELSE <SOURCE_DIR>/configure --prefix=<INSTALL_DIR> --with-pcre-prefix=${pcre_home} PCRE_LIBS=${pcre_home}/lib/libpcre.a
BUILD_COMMAND = make CPPFLAGS=-I${pcre_home}/include
INSTALL_COMMAND = make install
DEPENDS = pcre

@ktf
Contributor

ktf commented Mar 18, 2015

@hegner

I agree about platform (HEP_PLATFORM?), compiler (HEP_COMPILER?), verbosity (HEP_VERBOSITY?). I'm not sure what you mean by "build type". Is this debug?

I strongly disagree about including anything related to sources in the "build protocol"; that should be part of a different "protocol" (configuration?), IMHO. Different tools do fetching in different ways, and if you start mixing build and fetching you end up reinventing the wheel. How would you plug fetching into (say) homebrew? They have their own caching and fetching mechanism (just like nix, cmsBuild, or any other build system out there). IMHO the "build protocol" should simply assume sources are unpacked and you are inside the source directory.

@peremato Apart from the comment above about sources, I think the complication you describe comes from the fact that you want different behaviour for different versions, rather than extended behaviour based on the context.

Your example in my mind should really become:

# In ./build.sh
./configure ${INSTALL_DIR+--prefix $INSTALL_DIR} \
            ${PCRE_ROOT+--with-pcre-prefix=$PCRE_ROOT PCRE_LIBS=$PCRE_ROOT/lib/libpcre.a } 

make CPPFLAGS="${PCRE_ROOT+-I$PCRE_ROOT/include}"
make install

this script could be used by any build system (including, for example, homebrew or nix) without modifications, just by changing the context they build in. It would be pluggable into a bunch of different build systems; in the limit it could simply work by doing:

INSTALL_DIR=/usr/local
PCRE_ROOT=/usr
./build.sh

The above is really a declarative recipe, to the point it could be written as:

{
  "INSTALL_DIR": "/usr/local",
  "PCRE_ROOT": "/usr",
  "BUILD_SCRIPT": "./build.sh"
}

which in the end is very similar to what Nix does. Your example is putting an if statement inside a variable, which is as "non-declarative" as it can get, IMHO. To be clear, I think my idea is: parts which are really declarative should go in the "configuration" level, parts which are non-declarative go in the "build recipe" level.
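A toy driver for such a declarative recipe could simply export the declarative keys and then hand off to the non-declarative script. This sketch uses a plain KEY=value file rather than JSON (to stay in pure shell); the file name, keys, and format are all hypothetical, and the final hand-off is only echoed:

```shell
#!/bin/sh
# Toy driver: read a declarative recipe of KEY=value lines, export
# each key into the environment, then hand off to the non-declarative
# build script named by BUILD_SCRIPT.
cat > recipe.txt <<'EOF'
INSTALL_DIR=/usr/local
PCRE_ROOT=/usr
BUILD_SCRIPT=./build.sh
EOF

# Split each line on the first '=' into key and value, and export it.
while IFS='=' read -r key value; do
    export "$key=$value"
done < recipe.txt

# A real driver would now execute "$BUILD_SCRIPT"; echo it instead.
echo "would run: $BUILD_SCRIPT with INSTALL_DIR=$INSTALL_DIR PCRE_ROOT=$PCRE_ROOT"
```

This keeps the split clean: everything declarative lives in the recipe, everything imperative lives in build.sh.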

Apart from that I've also the following comments:

  • Making the installation relocatable should in my mind be part of either the patching phase (which can then be fed back upstream), or it should be handled automatically by the build tool (by installing into a unique path, so that the unique hash can be used after installation to relocate correctly in an automatic manner).
  • Collecting log files should be done at a different level, IMHO. What the "build protocol" could provide is a callback which gets invoked with generated log files. Something like HEP_LOG_CALLBACK which gets invoked by ./build.sh pointing to the actual build log for processing.
  • As I said, there should not be explicit dependencies but the build script should extend behaviour based on the dependencies being provided.
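The log-callback idea in the second bullet could look like this inside a build.sh; HEP_LOG_CALLBACK is the hypothetical hook named above, the "build" is faked with an echo, and the callback is only invoked if the caller exported it:

```shell
#!/bin/sh
# Sketch of the proposed log hook: build output is captured to a
# file, and if the caller exported HEP_LOG_CALLBACK it is invoked
# with that file for post-processing (upload, parsing, archiving).
LOG=build.log
echo "pretend build output" > "$LOG"   # stands in for: make ... > "$LOG" 2>&1

if [ -n "${HEP_LOG_CALLBACK:-}" ]; then
    "$HEP_LOG_CALLBACK" "$LOG"         # caller-supplied log processor
fi
echo "build finished, log in $LOG"
```

The build script stays ignorant of how logs are collected; each build system plugs in its own processor by exporting one variable.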

@ktf
Contributor

ktf commented Mar 18, 2015

@sbinet I think relocatability should be mandatory. We (CMS) have as a requirement to be able to build and run from any path, as non-root.

@hegner
Member Author

hegner commented Mar 18, 2015

@ktf build type indeed meant debug or optimised. As for the location of the source: while Pere and others will disagree with that approach, I actually meant a place in the local file system. How it ends up at that place is indeed up to a different tool or step.

@ktf
Contributor

ktf commented Mar 18, 2015

@hegner place in the filesystem would be ok for me as well.

@brettviren
Member

I think we should understand a procedure for directing this work before jumping in to solutions.

One person's "packaging" is another person's muck and mire. We should define what problems we are trying to solve more clearly before talking about solutions. We should figure out a metric to measure proposed solutions before proposing solutions. We should collect existing solutions and measure them against this metric before contemplating writing something new.

GitHub issue threads are kind of lousy for this type of discussion. Would it not be better to have it on a mailing list?

@ktf
Contributor

ktf commented Apr 3, 2015

I disagree. That way we will end up with something nobody wants, apart from maybe the one who managed to impose his solution. It has happened in the past, already.

Experiments already have their tools and for each one of them their tool scores 100% in the only metric which matters: "works for me and requires no extra manpower", otherwise we would not have this discussion.

IMHO, for this to succeed, the various tools should say what they are willing to "give up" for the sake of commonality and common effort. As I told @hegner at the last meeting, while I have strong opinions on how sources are referred to and distributed, I do not have them for everything that happens between "sources expanded in a single directory" and "files installed with make install". Given that that is 95% of the maintenance effort we are currently putting into this kind of thing in CMS, I think it would be extremely useful to have an agreed way of doing this, rather than focusing on agreeing on things we know we disagree on. I would personally move to it the day after if I had something which I invoke with:

build.sh -DPATH_TO_MY_SOURCES=/path -DPREFIX=/install

and lets me worry about everything else.

@brettviren
Member

Well, your build.sh example is just a high level automation interface and doesn't really impart any requirements.

What I'm trying to do is exactly follow a process where imposition of singular will is not emphasized. I don't see how this means we are trying to agree on things we disagree on. If we can't come up with a common understanding of the scope of "packaging" then there's no point in collaborating.

So, I'd like to define what we are trying to accomplish instead of having everyone make their own assumptions about which problem we solve based on whatever things they are facing. Maybe one's pet problem can be identified as a subset or aspect of the entire problem, and then people can warm up to contributing toward that part.

I see this problem factoring into these major areas:

  1. Build configuration and release management. Ways to define a suite of software, specify the configuration for building including things like package versions, build variants, target platforms.

  2. Packaging. Forms of bundling the results of the build in a way that they may be transferred to a user machine.

  3. Installation and user setup. How to apply the packages and make them ready to use. This is tied to (2), of course.

And these are likely incomplete as they are based on how I view the problem. So, I don't mean to say this is the best separation, but if we can't write down and agree on some kind of breakdown and clarification of the goal like this then I don't know how we can work together towards a common solution.

Some general desired features:

  • automation (hands-off build / installation, complex actions hidden behind simple interfaces)
  • modularity between layers (don't tie user run-time setup into build-time environment)
  • flexibility to assert domain-specific policy (e.g., how files are laid out; maybe I want to use a different binary package than you, maybe you want a different run-time setup system than me)
  • concise configuration (I want to tell my users: "take this single file, it's all you need to build everything to reproduce Run X Analysis")
  • efficiency of reuse (I don't want to rebuild Package-X for Release-B if it was already built and available from Release-A, my users normally do not want to build if they can install and they don't want to install if they can just setup)
  • don't assume all users are dumb (users sometimes want and must rebuild the entire stack in a way that they can tweak the build variant, hack on ROOT, swap out versions of Python, add print statements to Geant4)

I hope none of these requirements are controversial (they are certainly incomplete) and lead to us trying to "agree on things we disagree". But, without stating them, how can we expect to satisfy them?

A lot of work has gone on in this arena already, both in our community and outside. If we do not evaluate that work in light of some kind of criteria then we are setting ourselves up to reinvent the wheel, yet again (and probably poorly).

The things on my personal radar include and I'd like to know what other ones I might be missing:

  • homebrew looks interesting
  • conda looks interesting
  • guix/nix looks interesting
  • CMT is still interesting
  • Sebastien's hwaf is very interesting
  • Of course, my own worch is personally interesting

I know right now that none of these, as-is, are 100% suitable to satisfy the requirements in my head and so are less likely to satisfy all requirements we can come up with. But some of them are partly suitable, some of them can be changed to be made more suitable and some of them have incredibly good concepts that can, at the very least, be taken and incorporated in new development we may perform.

So, again, I think we need to produce a document which defines roles, use cases, states what types of functionality we require, critically reviews the existing systems, ranks them against our requirements and then tries to determine what the best approach is going forward, be it adoption, adoption+improvement, whole-cloth reinvention or some aggregation of these. I'm willing to start such a document (kept as latex in an open HSF github repo). To be successful it will need input from many people in various roles across HEP.

@sextonkennedy
Member

@brettviren I am also interested in having a clear document with the elements you describe. Reading the above, I see lots of agreement, so it's clear to me that we're making progress. As for format I'd be willing to use Latex but as I'm looking at a button right now that says "Markdown supported", I'd have a preference for that ;-)

@ktf
Contributor

ktf commented Apr 3, 2015

Let me see if I can find the ones which were already written with exactly the same goals 3 years ago, I mean 6 years ago, I mean 10 years ago... ;-) I remember exactly the same discussion about CMT and SCRAM almost 13 years ago, using exactly this approach. I think we should at least try to avoid the mistakes of the past.

I personally think we should start by picking up the most widely used tools in HEP for the packaging business, e.g. lcgBuild (@hegner what's the name of your tool?) and cmsBuild, and try to iron out unneeded differences. My "build.sh" simplification is a clear point where we are spending a lot of effort (platform support) which could be unified, and for which I'm personally more than happy to pick up whatever @hegner proposes as an implementation.

Same for the building part. Given there is general consensus that CMake is the best tool for building small / medium projects, why do we still have so many generators, fastjet and many other bits which use configure / make? Shouldn't HSF direct effort to fixing that rather than to writing yet another nicely typeset design document for yet another "one tool to rule them all"?

I hate finding myself citing the Agile Manifesto, but IMHO there is some truth in "Customer collaboration over contract negotiation".

Clearly this discussion diverged already...;) See you in Japan, I need to finish my slides...;)

@drbenmorgan
Member

@brettviren 👍 I think this thread already demonstrates several slightly different viewpoints/understandings of "packaging".

@ktf O.k., where are lcgBuild and cmsBuild documented and obtainable from? How would I start using them to provide a software stack for a new experiment?

Regarding use of CMake, I believe HSF should promote the use of (or migration to) standard build tools, but this is orthogonal to the "packaging" problem. My perspective on the latter is that it is a layer above, and decoupled from, the "build tool" layer(s). RPM et al follow this pattern, and so don't care whether an individual package uses CMake, Autotools, or even a hand written Makefile. Decoupling also means that the "build tool" must not depend on the "packaging" layer - if this isn't done, portability, re-usability and adaptability suffer.

@ktf
Contributor

ktf commented Apr 14, 2015

@hegner To be clear, my point is about writing a design document, not documentation in general. Obviously "how to deploy" documentation should be there, or at least an example.

Even in that case, though, that does not solve the 95% of the problem which is maintaining the software stack. Unless we manage to agree on how to do that, there will be little gain from a common tool. To make a clear example: I could port the Alice (just to use a random example ;-) ) stack in a couple of weeks, using cmsBuild and all the tools I've written, know perfectly, and am happy with (so no documentation / design issues there). However even in that case, where I have everything under control, unless there were some way for me to leverage the build recipes of someone else, I would still pay 95% of the cost of that effort, and more importantly I would still pay the ~1 FTE equivalent ongoing effort required to maintain, port and adapt the software stack. That's where the cost goes and what should be fixed first, IMHO. Using any tool, regardless of how sophisticated, will have that problem unless there is a common way of handling build recipes.


@drbenmorgan
Member

@brettviren @sextonkennedy I'd like to start putting some notes together to contribute to the discussion as outlined by @brettviren. Just wondered what layout of files inside the (this?) repo should be used together with push/pull access. I can fork/work/pull request if that's easiest in the latter case. Regarding formats, I'd prefer Markdown.

@brettviren
Member

@drbenmorgan, I'd suggest just go for it and start putting some thoughts down. I would like to do the same but lately DUNE stuff is taking all my time. If others want to do similarly then we can exchange our writeups, have some time to digest them and then have a phone call. I'd suggest that until we have some consensus on how this packaging group will go (and what "packaging" actually means to everyone) we should keep this repository empty.

@sextonkennedy
Member

@drbenmorgan, @brettviren I agree I’d like to see people write up their thoughts, and I hope you will in whatever format you choose (I personally like Markdown too…). I don’t think it would hurt to use the repo for exchanging these. We could create a /whitepaper subdir so that it will be obvious that this was input to the consensus-building exercise, and this way people can use a real CMS if they decide to modify theirs. Agreed?

Cheers, Liz


@hegner
Member Author

hegner commented Apr 23, 2015

Hi all! I fully agree here. Let's get some thoughts written down in some simple markdown.
This common confusion about what "packaging", "building" etc. actually mean is something we resolved quickly when we had a mostly CERN-internal discussion on it. For the others, here are the presentations from back then:
https://indico.cern.ch/event/373973/
and here the minutes:
https://indico.cern.ch/event/373973/material/minutes/minutes.html
Please give it a read. This led to this effort here of just trying out the ideas we had, which has now got this useful discussion started :-)

@ktf
Contributor

ktf commented Apr 27, 2015

@brettviren I just read your comment about leaving this repository empty after committing the file. Sorry about that. Maybe we should have an "rfc" folder for stuff we want to show / share with others without it being "official"?

@brettviren
Member

No worries. It was just a suggestion.

@sextonkennedy
Member

Hi All,

I’ve talked to a number of you one on one, and we agree that it is time to collect up all of the information submitted to the group so far, and discuss a direction that we’re all comfortable with. So far Giulio is the only one who has posted a white-paper. I encourage others to post theirs as well. I’ve created a doodle poll to try to find a suitable time (it is timezone enabled):
http://doodle.com/c934qas5gcqcnncp

Please fill it out this week and I’ll send out a meeting announcement this Fri.

Cheers, Liz

@sextonkennedy
Member

Hi All,

According to the poll, Wed. the 20th at 11am is the best time for this meeting. Please mark your calendars. I’ll set up an Indico next week, and I’ll have some slides to introduce the discussion. Please let me know if you would like to present something.
I would still like to see some more contribution of ideas and/or documentation on packaging systems you have worked on or particularly like.

Cheers, Liz


@drbenmorgan
Member

@sextonkennedy Pull request with my notes #1!

If there is interest, I can say something briefly on experiments using Homebrew/Linuxbrew for packaging Art (for @DUNE) and @SuperNEMO-DBD software.

@sextonkennedy
Member

Hi Ben,

Thanks for this. I think it looks interesting, and when I looked at the homebrew-dunebrew root.rb (https://github.com/drbenmorgan/homebrew-dunebrew) it looks like it contains one of these recipes, or protocols, for building root. So in that sense it is an example. What I want to start with tomorrow is a discussion of how we move forward and make progress. Maybe the way forward is to have people present the work they’ve been doing, or their experiment’s solution, for discussion. Please prepare something and we’ll see tomorrow if people like this approach.

Cheers, Liz


@drbenmorgan
Member

Hi Liz,

O.k., I’ll put a couple of slides together on using Homebrew, plus a couple on this “build protocol” issue.

Cheers,

Ben.


@brettviren
Member

Although I'm in HEP-SF, apparently I can't push to this repo. I just made a pull request with some hurried thoughts. Please someone merge it. Sorry I didn't stick to .md, as Org is far easier for me.

The thoughts are not as carefully written as I'd like but I tried to partly accomplish two things:

  • provide what I think are important separation of roles involved in the thing called "packaging"
  • begin a (still incomplete) list of major tools I think are needed and some that I know of that exist now.

I also added a summary of a tool I wrote (Worch) which I think is useful in this space.

@brettviren
Member

In reading this:

https://github.com/HEP-SF/packaging/blob/master/ktf_Packaging_ideas.md

I see an obvious dichotomy of design: whether or not to keep HEP/experiment-specific information within the forked source package. I expect this to be a source of discussion as we go forward.

While I very much like the idea of providing a central and reliable copy of the source code which is independent from what the various upstream developers provide, I do not like the idea of forking it in order to add non-upstream files specific to one build system.

  • it adds extra work and "noise" when trying to get changes accepted upstream
  • it spreads around copies of repeated code requiring another system to be invented to keep them in sync
  • it tends to lock-in build methods and makes porting more difficult
  • fixes/improvements in one area or package tend not to get adopted by all

With Worch I take a concentrated approach. The many various and distributed "build.sh" scripts have analogues as Waf tools, either included directly in Worch or as a Worch add-on package (like worch-ups or dune-build/lbne-build).

@brettviren
Member

Sorry, one last note. If you just want to read my contribution it's here:

https://github.com/brettviren/packaging/blob/master/bv-packaging-views.org

@jouvin

jouvin commented May 19, 2015

@brettviren it is intentional that direct pushes to hep-sf repos are not allowed... a PR is the way to go.

@sextonkennedy
Member

Hi All,

I have created an Indico agenda for the meeting tomorrow:

https://indico.cern.ch/event/395449/

The vidyo connection information is available from the agenda.

Cheers, Liz


@ktf
Contributor

ktf commented May 19, 2015

@brettviren if dropping the build recipes from the forked sources is a way to compromise, I'm happy to follow that way. In the end, as you say, the most important thing is to have a GitHub-like repository with the sources and the HSF / experiment-specific fixes, so that we have some chance of properly viewing them and maybe avoid reinventing the wheel.

@ktf
Contributor

ktf commented May 19, 2015

@sextonkennedy: You said 11am in the previous email, while the agenda says 18:00. Which one is the correct one (I assume the latter, but just in case)?


@sextonkennedy
Copy link
Member

Hi Brett and Giulio,

I’m asking this question because I want to make sure I understand what @brettviren is proposing. Are you proposing that the "build.sh” scripts or Waf tools are curated within the repository for the master packaging-integration-building tool, rather than with the source code for the package?

Cheers, Liz


@brettviren
Copy link
Member

Hi Liz. Neither, or maybe both.

I'm definitely saying that we should NOT add build-related files to otherwise pristine forks (or forks which are mostly pristine but carry a few local bug fixes). It sounds like this is not a totally objectionable view, which is great to hear.

I'd like whatever build orchestration system to keep per-package support scripts/files in some "other" place. At a minimum, this might be some central repository or maybe group of repositories (one per package). This is the homebrew strategy, I believe.

I think careful factoring of build method and build configuration is needed, as is some understanding of how that can be done in the context of different experiments with some overlapping requirements and some conflicting ones. With Worch, I have taken a hierarchical approach to where such files "live" and I think it provides for this factoring.

Worch comes directly with support for the most common build methods. It then provides a means for the user to aggregate novel ones by providing dependent (Python) packages. Since this aggregation is through the standard Python module system one can exploit standard Python packaging "ecosystem" to allow the build environment to be bootstrapped easily and automatically, even if one happens to require many Worch extensions. I have found that exploiting PyPI is particularly useful here.

As an example, there is https://github.com/brettviren/worch-ups which provides support for creating UPS products from an otherwise "standard" installations. This is just another Python package that depends on Worch and provides the extra Waf tools needed to enact the UPS-related build methods. Going up one more layer there is https://github.com/DUNE/lbne-build (I still need to change its name to "dune-build") which depends on both worch and worch-ups via the usual Python packaging mechanism. It happens to provide support for building Pandora and tbb as well as housing the build configuration files for the DUNE/LBNE LArSoft (portable-CMake version) software suite. Both are distributed via PyPI so one just needs to do:

pip install lbne-build

To bootstrap.
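For concreteness, the pip bootstrap above could be backed by a conventional setuptools manifest. This is only an illustrative sketch: the package names come from the comment, while the version, description, and everything else are made up:

```python
# Hypothetical setup.py for an experiment-level build package that pulls in
# worch and worch-ups via standard Python dependency resolution.
# Only the package names appear in the discussion; the rest is illustrative.
from setuptools import setup

setup(
    name="lbne-build",
    version="0.0.0",  # illustrative version
    description="Build configuration and extra Waf tools for an experiment suite",
    install_requires=["worch", "worch-ups"],  # resolved from PyPI by pip
)
```

With a manifest along these lines published to PyPI, a single `pip install lbne-build` pulls in the whole chain of build tooling in one step.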

The configuration files can also be factored out into their own package, and in this case maybe should be. This would allow DUNE to have its own build configuration package while allowing other experiments with similar software suites (e.g., any LArSoft-based ones) to benefit from sharing the special build methods that are needed. The experiment would then focus on making releases by developing the build configuration package, and some other group, maybe made up of folks in the Fermilab computing division (just for example), could maintain a shared Worch package housing whatever extra Waf tools may be needed. Wider collaboration is also possible; for example, maybe one day LSST wants to use worch-ups to provide EUPS packages.

So, while I think Worch is ideal for this, what I hope is that we can create structures that allow for this kind of shared build infrastructure while still allowing individual customization where it is needed.

@sextonkennedy
Copy link
Member

Hi Giulio,

I should have been more careful. The time is 18:00 CET / 11:00 CDT (FNAL) / 12:00 EDT / 9:00 PDT

Thanks, Liz


@sextonkennedy
Copy link
Member

We now have 3 pull requests, Brett’s, Ben’s, and one from me that you can read here:
https://github.com/sextonkennedy/packaging/blob/master/white_papers/fnal_requirements_cover.md
https://github.com/sextonkennedy/packaging/blob/master/white_papers/fnal-build-mgmt-requirements.pdf

Cheers, Liz


@drbenmorgan
Copy link
Member

I agree with @brettviren on keeping "packaging recipes" separate from the pristine/patched project sources. These packaging recipes (RPM specfiles, Homebrew Formulae, etc.) are the "build protocol": they call the project's build script(s) (Make, CMake, etc.) and supply them with the needed info.

Just to quickly comment on the Homebrew way of storing 'recipes' (Formulae in brew-speak): a whole set is supplied with an install of brew, stored in `Library/Formula`:

https://github.com/Homebrew/homebrew/tree/master/Library/Formula

For example:

https://github.com/Homebrew/homebrew/tree/master/Library/Formula/clhep.rb

Additional git repos holding Formulae, "Taps", in brew-speak, can be created, e.g.

https://github.com/Homebrew/homebrew-science

Those are added to a brew install by doing

$ brew tap homebrew/science
$ brew install root6

However, it should be noted that Taps can't be prioritised, as in yum for example. If a Tap supplies an updated Formula already in the mainline repo, brew will always use the mainline version unless the Formula name is fully qualified with the Tap name. However, adding this feature is on the Homebrew radar:

https://github.com/github/gsoc#homebrew

@ktf
Copy link
Contributor

ktf commented May 19, 2015

Well, there are plenty of other perfectly valid projects, e.g. those based on Travis, Python distutils, Maven, or Docker, not to mention properly done "configure && make && make install", where build recipes are inside the projects themselves. IMHO, having them separate just moves the issue around: you move the intelligence for picking up the correct build recipe from picking up the correct version of an external to picking up the correct version of a configuration which you think has the version of the external you need. I personally find the logic:

git clone -b hsf/version-i-want-for-external external
cd external
./build.sh

much easier (and standalone) to handle than:

git clone -b correct-version-of-my-tool my-tool
git clone -b version-of-configuration-i-think-contains-my-external my-configuration
URL=`parse-configuration-to-find-url my-configuration/external-recipe`
get-sources-given-url $URL $BUILDDIR
my-tool/buildTool -c my-configuration/external-recipe $BUILDDIR 

which is what most of us do in our tools right now (including cmsBuild, so it's not like I did not think it was a good idea 9 years ago). Stuff like homebrew, koji, and macports works that way because basically one of the following two assumptions is true:

  1. There is one configuration to worry about. This is not true in our case.
  2. They were conceived in a (pre-git) time in which maintaining sources with patches on top was painful. Thank god (and Linus) we have git (and GitHub) now.

If 1 and 2 do not hold, having build recipes separate does not simplify the problem; it simply adds an extra layer of indirection around where the problem is.

That said, again, if keeping tool-independent build recipes separate helps going forward, let's keep them separate.

@peremato
Copy link
Member

Hi Liz,

I should have been more careful. The time is 18:00 CET / 11:00 CDT (FNAL) / 12:00 EDT / 9:00 PDT

I have a problem then. I was convinced that it was at 11am CET. I have scheduled a concurrency forum meeting at 17:00 CET, which most probably will be over by then.
Cheers,

    Pere

@brettviren
Copy link
Member

@ktf Just to be clear, I'm not suggesting to remove or ignore whatever native, package-level build mechanism is provided by the upstream developers. Quite the opposite: the main strategy of Worch is to use that native build mechanism while providing a layer of "putty" over all the bumps caused by the fact that each package provides different build methods. This "putty" layer then presents a smooth interface to the higher-level automation and configuration layers in Worch.

So, all of the nitty gritty details you show in your examples, whatever may be needed, I want them to be ferreted away behind/inside Worch/Waf tools and nominally not exposed to the end user.

Also, in reference to Ben's approach, I want to mention that using homebrew, either directly or by just taking its concepts, is not inconsistent with using Worch for the global configuration and automation layers. Worch on top of homebrew would be rather easy to implement (in terms of Worch), as it would be homebrew that provides the difficult and detailed "putty" layer, and Worch would be able to "talk" to that putty layer via a single Waf tool (e.g., provided by a "worch-brew" package). Indeed, this approach might be maximally beneficial as it would provide a well known/documented/supported interface at the "putty" layer, which would be useful even on its own. It would also allow alternatives to be developed for the high-level configuration/automation layers for people who don't want to use Worch for whatever reasons.

What remains to understand (in my mind) with this approach is if homebrew can produce packages in forms that we all need. I think it can but I don't have a lot of experience with it nor do I know what forms everyone expects. Defining these forms is clearly something we must do.

@drbenmorgan
Copy link
Member

@ktf O.k., I think I understand better now. Let me try to translate this into a partial brew-based example using the zlib case from your notes.

The root directory would have files

$ ls
configure    ....    zlib.rb
$ git status
On branch cms/v1.2.7

Here the zlib.rb is semi-equivalent to the build.sh (which I'm ignoring temporarily) and has contents

class Zlib < Formula
  url File.dirname(__FILE__), :using => :git, :tag => "v1.2.7"
  head File.dirname(__FILE__), :using => :git, :branch => "cms/v1.2.7"

  def install
    ENV["CFLAGS"] = "-some -list"
    system "./configure", "--prefix=#{prefix}"
    system "make", "install"
  end 
end

In a homebrew install, this git repo could in principle be added as a Tap:

$ brew tap cms-externals/zlib/
$ brew install zlib
... Would install zlib from the v1.2.7 tag ...

Homebrew's limitation at the moment is that there's no direct support for tapping branches, though it could be added, see the any-tap, err, tap (Brett's worch might also help here).

It also ignores, for simplicity, Arch/compiler specific CFLAGS other than to illustrate where they'd be set. Homebrew treats that as a higher level "policy" (IIRC CMT does something along these lines too) handled through Superenv

Going back to the build.sh, the zlib.rb could call it directly instead, or, as I think is the suggestion,
zlib.rb would move to a dedicated tap and be:

class Zlib < Formula
  url "https://github.com/cms-externals/zlib.git", :tag => "v1.2.7"
  version "1.2.7"

  def install
    # Somehow ensure Homebrew sets all the hidden "HEP" variables correctly
    # Gets worse if dependencies come in, but maybe SuperEnv helps
    system "./build.sh"
  end
end

@drbenmorgan
Copy link
Member

What remains to understand (in my mind) with this approach is if homebrew can produce packages in forms that we all need. I think it can but I don't have a lot of experience with it nor do I know what forms everyone expects. Defining these forms is clearly something we must do.

I think that'd be very useful - by 'form' do you mean filesystem layout of installed files and/or metadata (and binary package formats)?

My knowledge of Homebrew's binary 'bottles' is limited, but on disk, packages are installed in the following structure (using Boost as a simple example):

+- HOMEBREW_PREFIX/
    +- bin/
    +- include/
    +- lib/
    +- Cellar/
        +- boost/
            +- 1.57.0/
            |   +- INSTALL_RECEIPT.json
            |   +- include/
            |   +- lib/
            +- 1.58.0/
                +- INSTALL_RECEIPT.json
                +- include/
                +- lib/

The INSTALL_RECEIPT.json file looks like:

{
  "used_options":["--c++11","--without-single","--without-static"],
  "unused_options":["--universal","--with-icu4c","--with-mpi"],
  "built_as_bottle":false,
  "poured_from_bottle":false,
  "tapped_from":"Homebrew/homebrew",
  "time":1420473509,
  "HEAD":"570ce2d7b3a5a39c18d443c49a764cbd1fe8a624",
  "stdlib":"libcxx",
  "compiler":"clang"
}

So a certain amount of configuration info there...
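Since the receipt is plain JSON, tooling layered above Homebrew could read it back to learn how a package was built. A minimal sketch, where the function name and the choice of summarized fields are purely illustrative (the keys are taken from the example receipt above):

```python
# Read a Homebrew INSTALL_RECEIPT.json and summarize how the package was built.
# The keys match the example receipt shown above; real receipts may carry more.
import json

def summarize_receipt(path):
    with open(path) as f:
        receipt = json.load(f)
    return {
        "options": receipt.get("used_options", []),
        "from_bottle": receipt.get("poured_from_bottle", False),
        "compiler": receipt.get("compiler", "unknown"),
    }
```

A higher-level orchestrator could use something like this to check that an installed package was built with the options an experiment expects.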

@sextonkennedy
Copy link
Member

Hi All,

When I asked for a future meeting at last week’s meeting, I didn’t remember that this week was a short week on both sides of the ocean. Since it is, I’ve shifted the suggested times for meetings on the doodle poll, with most of the options next week. Please sign up here:

http://doodle.com/thbi6sybp63akggm

Also I’m using both the github issue, and the google group this time. However at the last meeting we agreed to move to the google group. You can do this by sending an email to:

[email protected]

or sign up on the web page.

Cheers, Liz

@sextonkennedy
Copy link
Member

Hi All,

The most available time in the doodle poll with everyone saying yes or if-need-be is on Tues. Since I also have to give more weight to Brett’s preference (remember that he will be presenting his ideas at this meeting), I’ve selected the 16:00 CERN/ 8:00 BNL/ 9:00 FNAL meeting time on Tues.  I will send out an indico agenda this weekend.

Thanks for participating, Liz

@sextonkennedy
Copy link
Member

Hi All,

I’ve created an agenda for our second meeting in which we’ll hear from Brett:
https://indico.cern.ch/event/398344/

The vidyo connect is available from the agenda page but here is also the direct link:
http://vidyoportal.cern.ch/flex.html?roomdirect.html&key=WZ3KuM6KYbLL

Cheers, Liz


@sextonkennedy
Copy link
Member

Elizabeth S Sexton-Kennedy [email protected] writes:

https://indico.cern.ch/event/398344/

Thanks. I uploaded the PDF for the presentation.

Just to be clear, while the agenda says "DUNE packaging ideas" I will
really present a "Worch Overview". There is nothing DUNE-specific.

Talk with y'all tomorrow.

-Brett.
