Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - Permutation Groupoid #707

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

ssteakley
Copy link
Contributor

This is a chunk of the work I've been doing toward implementation of #231. The main result here is the implementation of FinBijectionCycles: a concrete type for finite automorphisms that are backed by a set of cycles. Cycles are implemented as CircularVectors, from CircularArrays.jl, which are wrapped so that we can provide appropriate implementations of == and hash.

The commit also introduces an abstract type SetBijection <: SetFunction and redeclares IdentityFunction <: SetBijection.

Potential Issues:

  • Calculation of function values for FinBijectionCycles, as implemented here, is meaningfully slower than for vector-backed and dict-backed FinFunctions. If the underlying data of the function is truly going to be a set of cycles, then I'm not sure how the speed can be improved. It seems that some transformation would need to be computed.
  • For f :: FinBijectionCycles, show(f) doesn't look great.
  • This WIP was largely inspired by the preexisting implementations of the FinFunction types. Maybe this has been somewhat preempted, or superceded, by other ongoing work. Namely, the refactor of FinFunctions to use Columns.

Feedback would be greatly appreciated!

@ssteakley
Copy link
Contributor Author

ssteakley commented Jul 2, 2023

After some more work on this, trying to speed up composition for FinBijectionCycles, I now feel that this approach may be inherently too slow. Representing a bijection by a set of Cycles matches up intuitively with the mathematical cycle notation, but it is an unwieldy representation for computing function values and composites. This PR's algorithm for computing values, for example, is worst case O(n) for n the size of the domain. I wonder, would there be other advantages to this representation that would make it worth these instances of bad performance?

One alternative would be to store the mapping in a more usual way, either as a Vector or Dict, and then store the cycles themselves as metadata. Essentially, the mapping would be a wrapper around a FinBijectionDict or FinBijectionVector with some metadata. In the interest of performance, composition of two FinBijectionCycles probably should not return another FinBijectionCycles, since this would require the metadata (the new set of cycles, themselves) to be recomputed from scratch.

@ssteakley
Copy link
Contributor Author

More and more I am thinking that implementing Bijections as wrapper types, with SetFunctions inside, is what makes the most sense. For me, this goes together with the insight that, for most purposes, if you are going to compute the cycles of a FinBijection, you may as well compute and store its inverse at the same time. Both require computing the same data in a loop over the mapping's domain. This suggests that the implementation of FinBijectionCycles could profitably be based on that of a type BijectionBimap, which would represent a bijection as a pair of (inverse) functions. Computing the inverse along with the cycles is, of course, a little slower, and takes more space. But as a consequence, the resulting object is perhaps more faithful to mathematical cycle notation: indeed, a written set of cycles displays the inverse mapping "alongside" the mapping itself, truly on equal footing.

@ssteakley
Copy link
Contributor Author

ssteakley commented Jul 10, 2023

I have reimplemented FinBijectionCycles based on a system of wrapper types which represent bijections. Now, a FinBijectionCycles object is just a pair of inverse SetFunctions (i.e. a BijectionBimap) with the cycles themselves attached as ancillary data. Some odds and ends are still missing (e.g. an appropriate show method for FinBijectionCycles), but the code is passing tests for basic functionality.

As I explain in a # developer's note in src/CategoricalAlgebra/Sets.jl, the extant implementation of CompositeFunction is disadvantageous for my implementation of composition with inverses. The crucial fact is that iterated composites result in a tree-like structure of nested CompositeFunctions, and the type information of the "root" of the tree does not convey the types of the leftmost and rightmost functions of the composite. In my implementation, this leads to an inevitable efficiency loss when using compose with any CompositeFunction. The obvious idea for a way forward would be to try refactoring CompositeFunction to store a list of SetFunctions instead of a pair, thus eliminating the need to form nested CompositeFunctions and allowing the type to expose the types of the leftmost and rightmost functions.

BijectionThinWrap(FinFunction(f, args...))

function FinBijection(f::Union{AbstractDict{K,Int},AbstractVector{Int}}) where K
function minandmax(p::Tuple{<:Number, <:Number}, n::Int)::Tuple{Int,Int}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little overzealous for a type assert, the julia compiler will infer this method correctly.

function minandmax(p::Tuple{<:Number, <:Number}, n::Int)::Tuple{Int,Int}
(Int(min(p[1], n)), Int(max(p[2], n)))
end
minval, maxval = reduce(minandmax, values(f), init=(Inf, -Inf))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use typemax(Int), typemin(Int) for this sentinel value, since Inf and -Inf are Float64. You probably also want to return the min, then the max so that you can interpret it as a range. min:max

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, will change the sentinel value. About the ordering thing, if I'm not misunderstanding you, minandmax is already returning the values in that order, as is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nevermind that. reading min and max too many times in a row, I forgot which is which.

(f::AbsBijectionWrap)(x) = (unwrap(f))(x)

# Developer's note: should I add in a `known_correct` parameter, asking
# the caller to decide whether the SetFunction should be checked for bijectivity?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that is usually how we enable optional correctness checking.


# Developer's note: should I add in a `known_correct` parameter, asking
# the caller to decide whether the SetFunction should be checked for bijectivity?
@struct_hash_equal struct BijectionThinWrap{Dom<:SetOb, Codom<:SetOb,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thin has a technical meaning in CT, that there is at most one morphism between two objects. Here I think you are just meaning "thin wrapper" like a lightweight wrapper type. It would be good to avoid the word thin for this non-jargon usage.


function Base.show(io::IO, f::BijectionThinWrap)
replacerules = (r"^SetFunction" => "Bijection", r"^FinFunction" => "FinBijection")
print(io, replace(repr(f.func), replacerules...))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to include a test for this so that we can detect breakage. I can imagine changing the "SetFunction" and this pattern not matching any more.

# for `compose_inv`, the obvious idea would be to try refactoring
# `CompositeFunction` to store a list of `SetFunction`s instead of a pair.

function compose_inv(f::SetFunction, g::SetFunction)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be easier with MLStyle pattern matching and structural recursion.

- Rewrite `comp_inv_helper` with MLStyle pattern matching
- Reconceive `BijectionBimap`, allowing it to contain instances of `AbsBijectionWrapper`
- Reconceive `do_inv`, making it return a `Bijection`
- Tighten up `minandmax` definition
- Implement specific `do_compose` for `BijectionBimap`s
@ssteakley
Copy link
Contributor Author

ssteakley commented Sep 6, 2023

Thanks for the comments @jpfairbanks ! For now, I decided not to worry about implementing the known_correct parameter and corresponding checkers; that can wait. I addressed the other comments, and made a couple more changes. If this is looking acceptable, then I could try to proceed with the rest of the code for the permutation groupoid.

Other needed steps:

  • Write more tests/improve coverage

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants